












The Design of 
Computer Simulation 
Experiments 







edited by 
Thomas H. Naylor 


Duke University Press 
Durham, N. C. 1969 


© 1969, Duke University Press 


Printed in the United States of America




Preface 


The College on Simulation and Gaming of The Institute of
Management Sciences and Duke University sponsored a symposium 
on "The Design of Computer Simulation Experiments" which was 
held at Duke University on October 14-16, 1968. This volume 
contains the papers which were presented at that symposium. 

The objective of the symposium, which was attended by 
nearly 250 persons, was to bring together some of the leading 
experimental design statisticians and some of the leading man- 
agement scientists and econometricians who are users of com- 
puter simulation techniques and provide them with an opportun- 
ity to explore the relevance of experimental design problems 
and techniques to the design of computer simulation experi- 
ments with models of business and economic systems. 

In the introductory paper by Naylor, Burdick, and Sasser, 
an attempt is made to define the general scope of the problem 
of designing computer simulation experiments. A number of 
specific experimental design problems are defined and several 
techniques for analyzing data generated by simulation experi- 
ments are described. The section on experimental designs 
begins with a survey paper by Hunter and Naylor and includes 
papers on the problem of factor selection, response surface 
designs, and sequential designs. Papers on regression analysis, 
analysis of variance, multiple ranking procedures, and time 
series analysis are included in the section on data analysis 
techniques. The paper by Timothy Ling represents an initial 
step in the direction of a theory of simulation. Ling uses 
his "Theory of Statics and Dynamics of Simulation" to tie to- 
gether the experimental design and data analysis aspects of 


simulation experiments. 


A number of different methodological problems are consi- 
dered in the next section. The paper by Howrey and Kelejian 
is probably the most original paper in the entire volume. It 
is concerned with the question of when to use simulation (as 
opposed to analytical techniques) to analyze the properties 
of econometric models. This paper contains several theoreti- 
cal developments which have not been published previously. 
Moy's paper on "Monte Carlo Techniques" proposes numerous 
practical suggestions for reducing random error in simulation
experiments with queueing models. Krasnow's critical appraisal 
of the experimental design and data analysis properties of 
existing simulation languages also represents a useful contri-
bution to the literature.

Unfortunately, space limitations on this volume have per- 
mitted us to publish only three of the seven application papers 
which were presented at the symposium. Abstracts of the papers 
by Trueman, Stasch, Collins, and Boermeester are included, how- 
ever. The papers by Fromm and Evans are excellent state-of-the- 
art papers on the use of computer simulation experiments with 
large-scale econometric models. 

Numerous people contributed to the planning and implementa- 
tion of the symposium and the publication of this volume. How- 
ever, one person merits particular acknowledgment. Without the 
services of Mrs. Stephanie Goldsberry the symposium would not 
have taken place. In addition to spending countless hours over 
a twelve-month period on the planning of the symposium, she 
also typed the entire typescript from which this book was 
printed. We are also indebted to the National Science Founda- 
tion which provided partial financial support for the symposium 
through N.S.F. Grant GS-1926. 


Acknowledgments 


The Symposium on The Design of Computer Simulation Experi-
ments held at Duke University on October 14-16, 1968 was made
possible by financial support from the following business
firms:


Amco Chemical Company 

Atlantic Richfield Company 

Booz-Allen Applied Research 

Carborundum Company 

Celanese Chemical Company 

Celanese Plastics Company 

Computer Usage Development Corporation 
Deering Milliken Service Corporation 

Dow Chemical Company 

Eastman Dillon, Union Securities & Company
Eastman Kodak Company 

Ford Motor Company 

General Electric Company 

General Foods Corporation 

Grumman Aircraft Engineering Corporation 
Hallmark Cards, Inc. 

International Business Machines Corporation 
Kendall Textile Division 

McKinsey & Company, Inc. 

Mead Johnson Company 

Metropolitan Life Insurance Company 
Midwest Research Institute 

Minnesota Mutual Life Insurance Company 


National Energy Board of Canada 


Operations Research Incorporated 
Pillsbury Company 

Procter & Gamble Company

Research Triangle Institute 
Rexall Drug Company 

Scientific Software, Inc. 
Sinclair Oil Corporation 

Southern Pacific Company 

Sperry Rand Corporation (UNIVAC Division) 
Squibb Beech-Nut, Inc. 

Standard Oil of Indiana 

St. Regis Paper Company 

Sylvania Electric Products 

Union Bag-Camp Paper Corporation 
Westinghouse Electric Corporation 


Xerox Corporation 


Permission to Reprint 


The paper entitled, "The Design of Computer Simulation Ex-
periments," was published originally as "Computer Simulation
Experiments with Economic Systems: The Problem of Experimental
Design," Journal of the American Statistical Association,
LXII (December, 1967), 1315-1337. This paper is reprinted with
minor modifications with the permission of the Editor.

The paper entitled, "Response Surface Designs," was pub- 
lished originally as "Response Surface Techniques in Economics," 
Review of the International Statistical Institute, XXXVII (1969), 
18-35. This paper is reprinted with the permission of the 
Editors. 


Contents 


Introduction 


The Design of Computer Simulation Experiments
Thomas H. Naylor, Donald S. Burdick, and
W. Earl Sasser, Jr.


Experimental Designs


Experimental Designs
J.S. Hunter and Thomas H. Naylor


Factor Selection
John L. Overholt


Response Surface Designs
Donald S. Burdick and Thomas H. Naylor


Sequential Designs
Herman Chernoff


Data Analysis


Regression Analysis and Analysis of Variance
Harry Smith


Selection and Ranking Procedures
S.S. Gupta and S. Panchapakesan


Selection and Ranking Procedures: A Comment
John S. Ramberg


Time Series Analysis 
Donald Watts 


Statics and Dynamics of Simulation 
Timothy Y. Ling 


Methodological Problems 


Simulation Versus Analytical Solutions 
Philip Howrey and H.H. Kelejian 


Validation 


Richard Van Horn 


Monte Carlo Techniques: Theoretical 
D.C. Handscomb 


Monte Carlo Techniques: Practical 
William A. Moy 


Monte Carlo Techniques: A Comment 


Jack P. Kleijnen 


The Value of Sample Information 
Robert H. Hayes 


Simulation Languages 
Howard S. Krasnow 


Distributions of Blocks of Signs 





Norman R. Draper and Willard E. Lawrence 


Applications 
The Evaluation of Economic Policies 
Gary Fromm 


Non-Linear Econometric Models 


Michael K. Evans 




Complex Organizational Processes 
Manatee se tact tf 


Simulation Techniques (Abstract) 


Richard E. Trueman 


Multi-Dimensional Verification (Abstract) 
Stanley F. Stasch 


Life Insurance Models (Abstract) 
Russell M. Collins, Jr.


Risk Theory Models (Abstract) 
John M. Boermeester 







Introduction 







The Design of Computer 
Simulation Experiments 


Thomas H. Naylor, Donald S. Burdick, and 
W. Earl Sasser, Jr., Duke University 


Introduction 


In a computer simulation experiment, as in any experiment, 
careful thought should be given to the problem of experiment- 
al design. Although a number of researchers have considered 
the need to utilize experimental design techniques in compu- 
ter simulation experiments and have noted the extensive lit- 
erature on the subject of experimental design, management 
scientists and economists have had little or nothing to say 
about the problem of designing computer simulation experi- 
ments with models of economic systems. For the most part, 
the existing experimental design literature is concerned with 
the problems and techniques of designing real world experi- 
ments, whereas computer simulation experiments are in effect 
experiments with a mathematical model. Very little attention 
has been paid in print to the problem of sifting the experi- 
mental design techniques which are relevant to the design of 
computer simulation experiments from those which are not. 

The task of deciding which material in the experimental 
design literature is applicable to computer simulation exper-
iments is extremely difficult. This situation is likely to
be particularly acute to management scientists and economists 
who (prior to the advent of computer simulation) have had 
only limited opportunity to perform experiments with economic 
systems. For this reason, management scientists and econ- 
omists interested in designing simulation experiments with 
economic models are likely to encounter large amounts of mat- 
erial containing unfamiliar terms in surveying the experiment- 


al design literature. 







With the aid of an example model, we shall attempt to show 
the relationship between existing experimental design and 
data analysis techniques and the design of computer simula- 
tion experiments with models of economic systems. Factorial 
designs and response surface designs are described and re- 
lated to simulation experiments. Two general techniques of 
data analysis, regression analysis and analysis of variance, 
are also defined and interpreted. Several special cases of 
the analysis of variance are considered and related to our 
example model. These data analysis techniques include the 
F-test, multiple comparison procedures, multiple ranking tech-
niques, and spectral analysis. Sequential methods, incorpor-
ating considerations of design and analysis, are also discus-
sed. In the discussion of specific techniques, emphasis will
be given to a description of the basic nature of the tech- 
niques and the situations or problems in which they can be 
usefully applied. With this information, the reader should 
then be able to recognize some of the situations in which an 
existing technique would be helpful to him in designing sim- 
ulation experiments and to consult the references for de- 
tailed instructions on how to implement the technique. In 
addition, several experimental design problems will be dis- 


cussed. 


Computer Simulation Defined 


We shall define simulation as a numerical technique for 
conducting experiments with certain types of mathematical and 
logical models describing the behavior of an economic system 
on a digital computer over extended periods of time [62]. 

The starting point of any computer simulation experiment with 
an economic system is a model of the system to be simulated. 
That is, we assume that a model has already been formulated 
and its parameters have been specified. The principal dif- 


ference between a simulation experiment and a "real world" 




experiment is that with simulation the experiment is conduct- 
ed with a model of the economic system rather than with the 
actual economic system itself. Given a model of an economic 
system, the design of a computer simulation experiment with 
the model usually requires that the analyst give special at- 
tention to the following four activities. 


Formulation of a Computer Program. The formulation of a 
computer program for the purpose of conducting computer 
simulation experiments with a model of an economic system 
requires that some consideration be given to the follow- 
ing: flow charts, computer program, error checking, data 
input and starting conditions, data generation (pseudoran-
dom numbers and stochastic variates), and output reports
[62, Chapter 7].


Validation. The model must be validated by comparing sim- 
ulated data (data generated by the computer) with actual 
and historical data [63]. If the model does not pass this 
test, then changes must be made in the variables, paramet- 
er estimates, and structure of the model [61]. 


Experimental Design. Two different types of experimental 
objectives can be defined: (a) to find the combination of 
factor levels at which the response variable is optimized 
and (b) to explain the relationship between the response 
variable and the controllable factors in the experiments. 
The design of simulation experiments requires that we give 
careful attention to: the problem of stochastic conver- 
gence, the problem of size, the problem of motive, and the 
multiple response problem [18]. 


Analysis of Simulated Data. The analysis of data generat- 
ed by a computer model consists of the collection and pro- 
cessing of the simulated data, computation of test stat- 
istics, and interpretation of results and includes the use 
of such techniques as regression analysis and analysis of 
variance (F-test, multiple comparisons, multiple rankings, 
spectral analysis, and sequential analysis). 
This paper is concerned primarily with the third and four- 
th steps of the aforementioned procedure for designing simu- 


lation experiments. 


An Example Model 


In order to illustrate the problem of designing computer 
simulation experiments with economic systems we have chosen 




a stochastic version of the Samuelson-Hicks [46,71] model as 
an illustrative example. This model has two principal attri- 
butes. First, it is a relatively simple model and well-known 
to economists. Second, although an analytical solution exists 
for certain special cases of this model, it still possesses 
many of the characteristics of more complex models which do 
not lend themselves to straight-forward analytical solutions. 
This section contains a description of the Samuelson-Hicks 
model and an outline of the procedure for treating it as a 
computer simulation model. For the sake of completeness, we 
also briefly describe an analytical solution to the model. 
Specifically, the model consists of the following param-
eters, variables, and functional relationships:


Parameters

(1) $b$ = accelerator coefficient
(2) $c_1$ = consumption coefficient for period $T-1$
(3) $c_2$ = consumption coefficient for period $T-2$
(4) $g$ = governmental parameter


Exogenous Variables

(5) $u_T$, $v_T$ = stochastic variates with known probability
distributions


Endogenous Variables

(6) $C_T$ = consumption in period $T$
(7) $I_T$ = investment in period $T$
(8) $G_T$ = governmental expenditure in period $T$
(9) $Y_T$ = national income in period $T$


Identity

(10) $Y_T = C_T + I_T + G_T$


Operating Characteristics

(11) $C_T = c_1 Y_{T-1} + c_2 Y_{T-2} + u_T$




(12) $I_T = b(Y_{T-1} - Y_{T-2}) + v_T$
(13) $G_T = g Y_{T-1}$

By substituting the values of $C_T$, $I_T$, and $G_T$ given by equa-
tions (11), (12), and (13), respectively, into equation (10),
equation (10) can be rewritten in the form of a second-order,
stochastic, linear difference equation,

(14) $Y_T - a_1 Y_{T-1} + a_2 Y_{T-2} = w_T$

where,

(15) $a_1 = (c_1 + b + g)$
(16) $a_2 = b - c_2$
(17) $w_T = u_T + v_T$

and $u_T$ and $v_T$ are independent random variables. If we let
$y_T$ denote deviations of national income from its equilibrium
value, then the final form of (14) which determines the time
path of national income can be expressed as

(18) $y_T - a_1 y_{T-1} + a_2 y_{T-2} = w_T$

The complete solution to (18) is given by [1,48]:

(19) $y_T = k_1 r_1^T + k_2 r_2^T + \sum_{j=0}^{T-1} \lambda_j w_{T-j}$

where $r_1$ and $r_2$ are the characteristic roots of (18), $k_1$ and
$k_2$ are arbitrary constants determined by the initial condi-
tions, and $\lambda_j$ is given by

(20) $\lambda_j = \dfrac{r_1^{j+1} - r_2^{j+1}}{r_1 - r_2}$
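
As a numerical check on this solution, the following sketch computes the characteristic roots and the weights of the stochastic response, and compares the closed-form weights with those obtained by direct recursion. It assumes the difference equation is written as y_T = a1*y_{T-1} - a2*y_{T-2} + w_T with distinct roots; the function names and the coefficient values are illustrative assumptions and do not come from the paper.

```python
import cmath

def characteristic_roots(a1, a2):
    """Roots of r**2 - a1*r + a2 = 0 (complex when a1**2 < 4*a2)."""
    disc = cmath.sqrt(a1 * a1 - 4.0 * a2)
    return (a1 + disc) / 2.0, (a1 - disc) / 2.0

def weights_closed_form(a1, a2, n):
    """Weights (r1**(j+1) - r2**(j+1)) / (r1 - r2) for j = 0..n-1; roots must differ."""
    r1, r2 = characteristic_roots(a1, a2)
    return [((r1 ** (j + 1) - r2 ** (j + 1)) / (r1 - r2)).real for j in range(n)]

def weights_recursive(a1, a2, n):
    """Same weights from the recursion lam[j] = a1*lam[j-1] - a2*lam[j-2]."""
    lam = [1.0, a1]
    while len(lam) < n:
        lam.append(a1 * lam[-1] - a2 * lam[-2])
    return lam[:n]

a1, a2 = 1.1, 0.5   # hypothetical coefficients giving damped complex roots
r1, r2 = characteristic_roots(a1, a2)
print("moduli of roots:", abs(r1), abs(r2))
print("closed form:", weights_closed_form(a1, a2, 5))
print("recursion:  ", weights_recursive(a1, a2, 5))
```

When both moduli are below one the weights die out, which is the stability condition on the transient response; the two lists of weights should agree up to rounding error.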

The solution for the time path of national income is com-
posed of two parts--a transient response and a stochastic re-
sponse. The usual procedure for determining the dynamic pro-
perties of the solution of difference-equation models in econ-
omics is to suppress the stochastic part of the solution and
to analyze only the deterministic solution. This is equival- 
ent to looking at the expected value of the time path of na- 
tional income in our model. Referring to a more general econ- 
ometric model which contains given non-stochastic exogenous 
variables, Philip Howrey has stated that, 







Two kinds of information are obtained from the determin- 
istic system. . The values of the roots of the determinant- 
al equation yield information about the modulus and per- 
iodicity of the transient response. If the roots are all 
less than unity in absolute value, the system is stable 
and approaches the particular solution from any set of in- 
itial conditions. If complex roots occur this is usually 
taken as an indication that the system will tend to oscil- 
late. The periodicity and rate of damping of the sinusoid- 
al components contributed by complex roots can be ascer- 
tained from these roots. Dynamic multipliers may be cal- 
culated to determine the response of the endogenous vari- 
ables to changes in the exogenous variables. 


There is little doubt that these methods provide inter- 
esting and useful information about the system of equa- 
tions. For short-term forecasting and the formulation of 
discretionary stabilization policy, these techniques may 
provide a sufficient characterization of the dynamic pro- 
perties of the model. If, however, the longer-term pro- 
perties of the model are to be investigated, it may not be 
reasonable to disregard the impact of the disturbance 
terms on the time paths of the endogenous variables. Nei- 
ther of the above techniques provides information about 
the magnitude or correlation properties of deviations from 
the expected value of the time path. 

In an earlier paper, Howrey [48] showed that disregarding the 
disturbance term in the Samuelson-Hicks model may be quite 
misleading. He demonstrated that stabilization policies de- 
signed to increase the stability of the system by reducing 
the modulus of the roots may in fact increase the variance of 
the system. 

If our model were a simultaneous equation model (and non-
recursive), nonlinear, and/or of higher order than two, then
analytical solutions would become increasingly difficult and the
benefits from using a computer to generate the time paths of
$Y_T$ would increase considerably. Although one could clearly perform
experiments with our simple example model without a computer, 
it does serve to illustrate many of the experimental design 
problems which are associated with more complex econometric 
models involving higher order nonlinear systems of difference 


equations. 
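
To make the idea of generating time paths on a computer concrete, here is a minimal sketch of one simulation run of the example model. It is not the authors' program: it assumes government expenditure is proportional to lagged income (G_T = g*Y_{T-1}), draws the disturbances as independent zero-mean normal variates, and uses hypothetical parameter values and starting conditions.

```python
import random

def simulate_income(b, c1, c2, g, periods, y_init=(100.0, 100.0),
                    sigma_u=1.0, sigma_v=1.0, seed=None):
    """Generate one time path of national income Y_T.

    Assumed structure (a sketch under stated assumptions):
      C_T = c1*Y_{T-1} + c2*Y_{T-2} + u_T
      I_T = b*(Y_{T-1} - Y_{T-2}) + v_T
      G_T = g*Y_{T-1}
      Y_T = C_T + I_T + G_T
    """
    rng = random.Random(seed)
    y = list(y_init)                      # starting conditions: Y_{T-2}, Y_{T-1}
    for _ in range(periods):
        y_lag1, y_lag2 = y[-1], y[-2]
        consumption = c1 * y_lag1 + c2 * y_lag2 + rng.gauss(0.0, sigma_u)
        investment = b * (y_lag1 - y_lag2) + rng.gauss(0.0, sigma_v)
        government = g * y_lag1
        y.append(consumption + investment + government)
    return y[2:]                          # drop the starting conditions

# Hypothetical parameter values; 24 simulated periods.
path = simulate_income(b=0.5, c1=0.4, c2=0.2, g=0.2, periods=24, seed=1)
print(path[:5])
```

Fixing the seed makes a run reproducible, so that two policies can be compared on the same sequence of random disturbances if desired.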







Experimental Design Terminology 


The two most important terms in the language of experiment- 
al design are factor and response. Both terms refer to vari- 
ables. Whether a variable in a particular experiment is a 
factor or a response depends upon the role played by the var- 
iable in the experiment in question. To illustrate the dif- 
ference between a factor and a response, suppose we have two 
variables, X and Y. If our experiment is designed to answer 
the question, how does a change in X affect Y, then X is a 
factor and Y is a response. In an experiment with a computer 
simulation model a response must of necessity be an endogen- 
ous (output) variable, whereas a factor will normally be a 
parameter or an exogenous (input) variable or some property 
of its probability distribution. 

For example, in the Samuelson-Hicks model the parameters
$b$, $c_1$, $c_2$, and $g$ could be treated as factors, as well as the
expected value, variance, and probability distribution of the
stochastic variates $u_T$ and $v_T$. The responses could be con-
sumption, investment, governmental spending, or national in-
come.

A large percentage of the terms and concepts in the theory 
of experimental design results from classification of the 
factors in the experiment by the following dichotomous ques- 
tions: 

1. Is the factor in question controlled or not?


2. Are the values (levels) of the factor observed or 
not? 


3. Is the effect of the factor a subject for study or 
is the factor included merely to increase the pre- 
cision of the experiment? 


4. Are the levels of the factor quantitative or quali- 
tative? 


5. Is the factor fixed or random?

A factor is controlled if its levels are purposefully sel-
ected by the experimenter. (In the case of economic models,
the experimenter is usually called a decisionmaker or policy- 




maker.) In our example model the accelerator coefficient may 
be directly or indirectly controlled by governmental monetary 
policy. On the other hand, wars, foreign competition, labor 
strikes, and national disasters are factors which might af- 
fect national income but which may not be subject to control 
by economic policymakers. 

In most experiments there will be a number of factors 
which are controllable but which are not controlled. For ex- 
ample, under a military dictatorship the marginal propensity 
to consume could presumably be tightly controlled, but in a 
free society the marginal propensity to consume is likely to 
be treated as a factor which is not subject to direct control
by governmental policymakers. Although consumption coeffici-
ents may be indirectly affected by changes in $b$ and $g$, we
shall assume that $c_1$ and $c_2$ are given.

A factor is observed if its levels are observed or meas- 
ured and recorded as part of the data. More often than not 
the observed factors consist of just the controlled factors 
in a particular experiment, but there are frequent exceptions. 
It is unwise to control a factor without observing it, but 
an uncontrolled factor may often be observed. In our example, 
wars and strikes, although uncontrolled, can be observed. 
Observations on uncontrolled factors are often called concom- 
itant observations. In the analysis of data concomitant ob- 
servations should be treated differently from observations on 
controlled factors. The analysis of covariance is a tech- 
nique of data analysis which utilizes concomitant observa- 
tions. Although concomitant observations are useful, in the 
real world it is never possible to observe all the factors 
which might affect a given response. 

The distinction between factors which are of basic inter- 
est and those which are included to increase precision is an 
important distinction for it serves to emphasize the fact 
that for almost all experiments the factors of basic interest 
are not the only ones to significantly affect the outcome. 







In the literature controlled factors which are included to in-
crease precision are often called block factors and their lev-
els are called blocks. In computer simulation experiments one
never has uncontrolled or unobserved factors. The role which 
uncontrolled and unobserved factors play in the real world is 
played in a computer simulation model by the random character 
of exogenous variables. The effects or variations in re- 
sponse which these factors cause in the real world have been 
incorporated in the computer simulation model in the form of 
experimental errors or random deviations. Once we have a 
model, the factors are determined, and it is not possible in 
an experiment on the model to identify additional factors as 
sources of variation. 

A factor is quantitative if its levels are numbers which 
are expected to have a meaningful relationship with the re- 
sponse. Otherwise a factor is qualitative. The parameters 
$b$, $c_1$, $c_2$, and $g$ in our experiment are all quantitative fac-
tors. Although the expected values and variances of $u_T$ and
$v_T$ are quantitative factors, the type of probability distri-
bution would be a qualitative factor. If part of the input
to a simulation model consists of a decision rule on economic 
policy, and if several policies are under consideration, the 
policy could be a qualitative factor. 

When an experimenter is investigating the effect of a fac- 
tor on a response, he will be interested in drawing infer- 
ences with respect to a certain range or population of levels 
for the factor. If all the levels of interest of a particu-
lar factor are included in the experiment, that factor is
said to be fixed. If, however, the levels of a factor that 
are actually included in the experiment constitute a random 
(or representative) sample from the population of levels in 
which the experimenter is interested, then the factor is said 
to be random. A random factor may be regarded as a fixed 
factor if the inferences drawn from the data are restricted 
to the levels of the factor which are actually included in 







the experiment. The notion of random factors permits infer- 
ences of a probabilistic nature to be made about factor lev- 
els which do not actually appear in the experiment. The tech- 
niques for accomplishing these inferences do not require that 
the factor be quantitative. In fact, for quantitative fac- 
tors much more powerful techniques (curve fitting and regres- 
sion analysis) are available. Discussions of this concept 

can be found in chapter 10 of the book by Hicks [45] and in 
the paper by Eisenhart [34]. 


The Analysis of Variance 


In a well-designed experiment consideration must be given 
to methods of analyzing the data once it is obtained. Most 
of the classical experimental design techniques described in 
the literature are used in the expectation that the data will 
be analyzed by one or both of the following two methods: an- 
alysis of variance and regression analysis. The analysis of 
variance is a collection of techniques for data analysis 
which are appropriate when qualitative factors are present, 
although quantitative factors are not excluded. Regression 
analysis is a collection of techniques for data analysis 
which utilizes the numerical properties of the levels of 
quantitative factors. From a mathematical point of view the 
distinction between regression and the analysis of variance 
is somewhat artificial. For example, an analysis of variance 
can be performed as a regression analysis using dummy vari- 
ables which can assume only the values zero or one. A treat- 
ment of the relationship between regression and the analysis 
of variance can be found in the book by Graybill [42]. An 
excellent treatise on the application of regression analysis 
has been written by Draper and Smith [31]. 
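
The remark that an analysis of variance can be run as a regression on zero-one dummy variables can be illustrated with a small sketch. The code below fits a response on an intercept plus one dummy per policy (omitting the first policy) by solving the normal equations; the policy labels, the data, and the helper names are hypothetical and chosen only for illustration.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * p for x, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def dummy_regression(levels, y):
    """Least-squares fit of y on an intercept and 0/1 dummies for all but the first level."""
    names = sorted(set(levels))
    rows = [[1.0] + [1.0 if lv == nm else 0.0 for nm in names[1:]] for lv in levels]
    k = len(names)
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return names, solve(xtx, xty)

# Hypothetical data: two observations of the response under each of three policies.
policies = ["A", "A", "B", "B", "C", "C"]
income = [10.0, 12.0, 20.0, 22.0, 15.0, 17.0]
names, beta = dummy_regression(policies, income)
print(names, beta)
```

The fitted intercept is the sample mean for the first policy and each dummy coefficient is the difference between another policy's mean and that baseline, so the regression reproduces the cell means on which the analysis of variance is based.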

The great bulk of experimental design techniques described 
in the literature have the analysis of variance as the in- 


tended method of data analysis. As an illustration of the 







analysis of variance let us consider a computer simulation ex- 
periment with the Samuelson-Hicks model in which monetary pol- 
icy and fiscal policy are qualitative factors, and national 
income is the response variable. Suppose there are six mon- 
etary policies and six fiscal policies under consideration. 
Associated with each monetary policy is an accelerator coef-
ficient and associated with each fiscal policy is a govern-
mental parameter. The accelerator coefficients are denoted
by b(1), b(2), ..., b(6) and the governmental parameters by
g(1), g(2), ..., g(6). In other words, the monetary policy and fis-
cal policy factors have six levels each. A basic experiment-
al design calls for the collection of data (generated by the
computer) for each of the six monetary policies in combina-
tion with each of the six fiscal policies. This basic de-
sign is called the factorial design for two factors. It is
customary to present this design in a two-way table as in
Figure 1.


                      Fiscal Policy
Monetary
Policy      g(1)   g(2)   g(3)   g(4)   g(5)   g(6)

b(1)
b(2)
b(3)
b(4)
b(5)
b(6)

Figure 1
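
The design points of a two-factor factorial design can be enumerated mechanically. The sketch below lists every combination of six monetary policies with six fiscal policies; the helper name full_factorial and the use of the indices 1 to 6 as level labels are assumptions made for illustration.

```python
from itertools import product

def full_factorial(**factor_levels):
    """Return every combination of the given factor levels as a list of dicts."""
    names = list(factor_levels)
    return [dict(zip(names, combo)) for combo in product(*factor_levels.values())]

# Six monetary policies and six fiscal policies, identified by index 1..6.
design = full_factorial(monetary=range(1, 7), fiscal=range(1, 7))
print(len(design))    # 36 design points, one per cell of the two-way table
print(design[:3])
```

Each dictionary in the list corresponds to one cell of the two-way table, and one or more simulation runs would be made at each design point.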







Each one of the thirty-six cells or boxes in Figure 1 cor-
responding to the thirty-six combinations of monetary poli-
cies with fiscal policies represents a population of possible
observations. For example, if monthly national income is the 
response variable, we might imagine a population of monthly 
national income data for all months during which fiscal pol-
icy g(4) might conceivably be used with a given monetary policy.
Of course, the actual experimental data will contain only a
sample (e.g., 24 months' data) from this population.

If we are interested in investigating the effects of the 
factors on the response, a logical first question to ask is, 
"Do the factors have any effect at all on the response?" The 
statement that the factors have no effect is a statement 
about the thirty-six populations in the experiment. It says 
that these thirty-six populations are all the same. We can 
therefore rephrase our logical first question to, "Do the 
(thirty-six) populations of our experiment differ, or are 
they all the same?" 

We still may not have the question we really want to ask 
of the data. There are many ways in which populations can 
vary, and we usually are not interested in all these ways. 
The population mean is an aspect of populations in which we 
are likely to be interested. A more suitable question might 
be, "Do the means of the (thirty-six) populations of our ex-
periment differ, or are they all the same?" The analysis of
variance is a tool for answering this question. 

To answer questions about means of populations one can and
usually does look at means of random samples from these pop- 
ulations. However, one cannot conclude that population means 
differ simply by noting that the corresponding sample means 
differ. The random character of sample means makes it virt- 
ually certain that two sample means will differ even when 
the corresponding population means are the same. In order to 
infer that the population means differ we must first measure 


the magnitude of random fluctuations. Such a measurement is




obtained from the variation between observations in the same
sample or cell (the within-cell variance). This variance sets
limits on how far the sample means of the cells can be expected
to differ by chance alone. If the population
means are in fact equal, then these limits will seldom be ex-
ceeded by the sample means of the cells. Therefore, if our
data shows that these limits are exceeded, we can infer that 
the population means are probably different. This type of 
inference, established by comparing between-cell variance to 
within-cell variance, is the essence of the analysis of vari- 
ance. 
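
The comparison of between-cell variance to within-cell variance can be sketched as a ratio of mean squares. The code below treats each cell as one sample and computes that ratio; the function name and the data are hypothetical. Under the usual assumptions of normality and equal variances, the ratio would be referred to an F distribution with (k - 1, N - k) degrees of freedom, where k is the number of cells and N the total number of observations.

```python
from statistics import mean

def f_ratio(cells):
    """Ratio of the between-cell mean square to the within-cell mean square."""
    k = len(cells)
    n_total = sum(len(c) for c in cells)
    grand = sum(sum(c) for c in cells) / n_total
    # Variation of the cell means about the grand mean, weighted by cell size.
    ss_between = sum(len(c) * (mean(c) - grand) ** 2 for c in cells)
    # Variation of the observations about their own cell mean.
    ss_within = sum((x - mean(c)) ** 2 for c in cells for x in c)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

# Hypothetical data: three cells with three observations each.
cells = [[10.0, 12.0, 11.0], [20.0, 22.0, 21.0], [15.0, 17.0, 16.0]]
print(f_ratio(cells))
```

A large ratio indicates that the cell means are spread more widely than random fluctuation within cells would explain.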

Of course, it is not enough to state that the factors in 
toto affect the response. We are very much interested in 
identifying and measuring the effects which individual fac- 
tors have on the response. For example, suppose the popula- 
tion means for all the cells in any one column are the same, 
but the population means differ from column to column. In 
our example if columns represent fiscal policies as in Fig-
ure 1, this would mean that different monetary policies in
combination with the same fiscal policy would result in the 
same national income, but that different fiscal policies 
yield different national incomes. In this case we would say 
that the fiscal policy factor affects the response, but the 
monetary policy factor does not. 

In order to separate the effects due to the two factors it
is customary to consider row and column means. In our ex- 
ample a row mean would be the average national income associ- 
ated with a particular monetary policy (row) when used with 
all six fiscal policies (columns). A column mean would be 
the average national income associated with a particular fis- 
cal policy and all six monetary policies. 

We have reached a point from which it would be difficult 
to continue without introducing some notation. Let $Y_{ij}$ de-
note the average national income associated with monetary pol-
icy i when combined with fiscal policy j. Let $Y_{i\cdot}$ denote the
average national income associated with monetary policy i in
combination with all six fiscal policies (the ith row mean),
and let $Y_{\cdot j}$ denote the average national income associated with
fiscal policy j for all six monetary policies (the jth column
mean). The average for all thirty-six cells, $Y_{\cdot\cdot}$, is called
the grand mean.

The main effect for a particular row (or column) is defin- 
ed to be the deviation of the corresponding row (or column) 
mean from the grand mean. Thus, the main effect for monetary 
policy i is $Y_{i\cdot} - Y_{\cdot\cdot}$, and the main effect for fiscal policy j
is $Y_{\cdot j} - Y_{\cdot\cdot}$. Suppose, as suggested above, that for any one
fiscal policy the average national income for each of the six 
monetary policies is the same. If the means are the same for 
the six monetary policies in combination with any one fiscal 
policy they will be the same for the six monetary policies 
when averaged over fiscal policies. Therefore, we will have
$Y_{1\cdot} = Y_{2\cdot} = \cdots = Y_{6\cdot}$. Since all row means are
equal, they will also be equal to their average $Y_{\cdot\cdot}$, the
grand mean, and therefore $Y_{i\cdot} - Y_{\cdot\cdot} = 0$ for each i; in
other words the row main effects are all zero. On the other 
hand, if national income varies from fiscal policy to fiscal 
policy, the averages over monetary policies will also vary, and
the column means will differ. Since the column means differ,
they cannot all be equal to their average, the grand mean,
so some column main effects must be non-zero. Thus, by look-
ing at main effects, we can obtain information regarding the
relative importance of the factors. 

If the main effects told the whole story, then each cell 
mean could be represented as the sum of the grand mean, a 
row main effect, and a column main effect (i.e.,
$Y_{ij} = Y_{\cdot\cdot} + (Y_{i\cdot} - Y_{\cdot\cdot}) + (Y_{\cdot j} - Y_{\cdot\cdot})$). The fact that this is not true in
general can be simply illustrated. Suppose the even-numbered
monetary policies result in above average national income 
when used with even-numbered fiscal policies but below aver- 
age national income when used with odd-numbered fiscal poli-
cies, whereas odd-numbered monetary policies yield above av-
erage national income when used with odd-numbered fiscal pol-
icies but below average national income when used with even-
numbered fiscal policies. (This example is admittedly arti- 
ficial and unrealistic, but it serves well to illustrate the 
point in question.) Each monetary policy will yield above 
average national income for half the fiscal policies and be- 
low average national income for the other half. Therefore, 
the national income averaged over fiscal policies will be the 
same for each monetary policy, and the row main effects will 
be zero. Similarly, each fiscal policy will yield above aver- 
age national income for half the monetary policies and below 
average national income for the other half, which implies 
that the column main effects are also zero. If the equation
$Y_{ij} = Y_{\cdot\cdot} + (Y_{i\cdot} - Y_{\cdot\cdot}) + (Y_{\cdot j} - Y_{\cdot\cdot})$ held in general, then each
cell mean would have to be equal to the grand mean whenever 
all main effects are zero. In the example just discussed, 
however, some cell means are above the grand mean and others 
are below it even though all main effects are zero. 

The difference between a cell mean and the value predicted 
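from the grand mean and the main effects, given by
$Y_{ij} - Y_{\cdot\cdot} - (Y_{i\cdot} - Y_{\cdot\cdot}) - (Y_{\cdot j} - Y_{\cdot\cdot}) = Y_{ij} - Y_{i\cdot} - Y_{\cdot j} + Y_{\cdot\cdot}$, is called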
an interaction effect. It is also customary to speak of a 
two-factor interaction between the fiscal policy and monetary 
policy factors. This terminology is redundant in an experi- 
ment involving only two factors. However, in experiments 
with more than two factors interactions involving three or 
more factors can occur, and two-factor interactions can occur 
between any pair of factors in the experiment. 
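
The even/odd example can be checked numerically. The following Python sketch is an illustration only and is not part of the original experiment: the cell values 105 and 95 and the function names are assumptions chosen merely to reproduce the pattern described above. It computes the grand mean, the row and column main effects, and the interaction effects from a table of cell means.

```python
# Hypothetical 6 x 6 table of cell means (rows: monetary policies,
# columns: fiscal policies). The values 105 and 95 are assumptions
# chosen only to reproduce the even/odd pattern in the text.
def checkerboard(n=6, high=105.0, low=95.0):
    # Policies are numbered 1..n; matching parity gives the high value.
    return [[high if (i % 2) == (j % 2) else low for j in range(1, n + 1)]
            for i in range(1, n + 1)]


def decompose(cells):
    """Return grand mean, row effects, column effects, interactions."""
    r, c = len(cells), len(cells[0])
    grand = sum(map(sum, cells)) / (r * c)
    row_means = [sum(row) / c for row in cells]
    col_means = [sum(cells[i][j] for i in range(r)) / r for j in range(c)]
    row_eff = [m - grand for m in row_means]
    col_eff = [m - grand for m in col_means]
    # Interaction = cell mean - row mean - column mean + grand mean.
    inter = [[cells[i][j] - row_means[i] - col_means[j] + grand
              for j in range(c)] for i in range(r)]
    return grand, row_eff, col_eff, inter


grand, row_eff, col_eff, inter = decompose(checkerboard())
```

For this assumed table every main effect is zero while each interaction effect is +5 or -5, so the additive representation cannot hold.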

In the absence of interaction the equation
$Y_{ij} = Y_{\cdot\cdot} + (Y_{i\cdot} - Y_{\cdot\cdot}) + (Y_{\cdot j} - Y_{\cdot\cdot})$ will hold. If the average national
income associated with monetary policy i is 3 billion dollars 
above the overall average, and if the average national income 
associated with fiscal policy j is 2 billion dollars below
the overall average, then, in the absence of interaction, we
can predict that the average national income associated with
monetary policy i in combination with fiscal policy j will be
3 - 2 = 1 billion dollars above the overall average; in
other words the performance of monetary policy i in combina-
tion with fiscal policy j can be predicted from a measure- 
ment (i.e., main effect) on monetary policy i only and a meas- 
urement on fiscal policy j only. Thus, the absence of inter- 
action implies that the factors have a certain kind of in- 
dependence. This independence is not statistical independ- 
ence, but an independence in the way in which the factors af- 
fect the response. Thus, the absence of interaction means 
that the effect on the response may be studied and
measured separately for each of the factors, and these sep- 
arate or independent determinations may be used to predict 
the response at any combination of levels for the factors in 
question. When this independence fails (as it did in the 
above example, since an average monetary policy in combina- 
tion with an average fiscal policy could produce an above
(or below) average result), then interaction will be present.

The absence of interaction implies even more than inde- 
pendence of the factors. It implies that the effects of the 
factors are additive. In other words the average national 
income associated with monetary policy i in combination with 
fiscal policy j is the sum of an overall average, an effect 
for monetary policy i, and an effect for fiscal policy j. 

If, instead, the average national income were the product of 
an overall average, an effect for monetary policy i, and an 
effect for fiscal policy j, then interaction would be pre- 
sent even though the factors retain their independence. 

An interaction which is caused by nonadditivity of inde- 
pendent factors can often be removed by a suitable transfor- 
mation of the data. For example, if effects are multiplica- 
tive when output data is used, then additivity can be re- 
stored by using logarithms of outputs as the mode of expres- 
sion for the data. Further reading on additivity and the use 
of transformations can be found in [2, 11, 15]. 
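
The effect of a logarithmic transformation can be illustrated with a small sketch. The numbers below (an overall level of 100 and the row and column factors) are hypothetical and serve only to build a table of multiplicative cell means; the helper name is likewise an assumption. The sketch measures the largest interaction effect before and after taking logarithms.

```python
import math


def max_abs_interaction(cells):
    """Largest absolute interaction effect in a table of cell means."""
    r, c = len(cells), len(cells[0])
    grand = sum(map(sum, cells)) / (r * c)
    row_means = [sum(row) / c for row in cells]
    col_means = [sum(cells[i][j] for i in range(r)) / r for j in range(c)]
    return max(abs(cells[i][j] - row_means[i] - col_means[j] + grand)
               for i in range(r) for j in range(c))


# Hypothetical multiplicative model: cell = overall * row * column.
overall = 100.0
row_factors = [0.8, 1.0, 1.25]
col_factors = [0.5, 1.0, 2.0]
cells = [[overall * a * b for b in col_factors] for a in row_factors]
logged = [[math.log(v) for v in row] for row in cells]

raw_interaction = max_abs_interaction(cells)      # clearly non-zero
log_interaction = max_abs_interaction(logged)     # zero up to rounding
```

On the raw scale the interaction effects are substantial; on the logarithmic scale they vanish apart from rounding error, because the logarithm of a product is a sum.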

Many experimenters habitually conclude that the presence
of interaction implies that the factors are not independent
without giving any consideration to the possibility of in- 
dependent but nonadditive factors. This practice is inad- 
visable and should be avoided. 

From our consideration of the two-factor example it should 
be clear that three-factor and higher order interactions can 
be defined in a strictly analogous manner (although the alge-
bra becomes increasingly complex).


Some Specific Techniques of Data Analysis 


Several special cases of the analysis of variance are con- 
sidered in this section. These techniques include the F-test, 
multiple comparisons, multiple rankings, spectral analysis,
and sequential sampling.


F-Test 


Suppose that we are interested in testing the null hypoth- 
esis that the expected values of national income for each of 
the thirty-six monetary-fiscal policy combinations are equal 
in Figure 1. The F-test is a straightforward procedure for 
testing hypotheses of this type. If the null hypothesis is 
accepted in our example experiment, then one tentatively con- 
cludes that the sample differences between policies are at- 
tributable to random fluctuations rather than to actual dif- 
ferences in population values (expected values of national 
income). On the other hand, if the null hypothesis is re- 
jected, then further analysis, such as multiple comparisons 
and multiple rankings, is recommended. The F-test rests on 
three important assumptions: (1) normality, (2) equality of
variance, and (3) statistical independence. The papers by 
Naylor, Wertz, and Wonnacott [65,67] contain two applica- 
tions of the use of the F-test to analyze data generated by
simulation experiments.
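
The comparison of between-cell variance with within-cell variance can be sketched as follows. This is a minimal illustration for a one-way layout under the assumptions listed above; the three samples are invented numbers, not output from the model, and the function name is an assumption. The resulting ratio would be compared with a tabled critical value of the F distribution for the stated degrees of freedom.

```python
def f_statistic(cells):
    """One-way analysis of variance F ratio for a list of samples."""
    k = len(cells)
    n_total = sum(len(c) for c in cells)
    grand = sum(sum(c) for c in cells) / n_total
    means = [sum(c) / len(c) for c in cells]
    # Between-cell sum of squares: spread of cell means about the grand mean.
    ss_between = sum(len(c) * (m - grand) ** 2 for c, m in zip(cells, means))
    # Within-cell sum of squares: spread of observations about their cell mean.
    ss_within = sum((y - m) ** 2 for c, m in zip(cells, means) for y in c)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within, k - 1, n_total - k


# Hypothetical national income observations for three policies.
samples = [[10.0, 12.0, 11.0], [14.0, 15.0, 16.0], [20.0, 19.0, 21.0]]
f_ratio, df_between, df_within = f_statistic(samples)
```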




Multiple Comparisons 


In the preceding section we described a procedure for test- 
ing hypotheses about the equality of population means. Typ- 
ically, decisionmakers are interested not only in whether 
alternatives differ but also in how they differ. Multiple 
comparison and multiple ranking procedures often become tools 
relevant to meeting the latter query, for they have been de- 
signed specifically to attack questions of how means of many 
populations differ. Like the F-test, both of these proce- 
dures assume normality, common variance and statistical in- 
dependence. 

In contrast with the analysis of variance, multiple com- 
parison methods emphasize the use of confidence intervals 
rather than the testing of hypotheses. For example, if one 
is interested in comparing the means of different popula-
tions, then a number of $(100 - \alpha)$% confidence intervals for
the differences between population means may be constructed. 
Scheffé [73] and Winer [77] have written comprehensive sur- 
veys of multiple comparison procedures. Naylor, Wertz and 
Wonnacott [65,67] have applied multiple comparisons to the
analysis of output data from simulation experiments.


Multiple Rankings 


Frequently, the objective of computer simulation experi- 
ments with economic systems is to find the "best," "second
best," "third best," etc. policy. Although multiple compar-
ison methods of estimating the sizes of differences between
plans (as measured by population means) are often used as a 
way of attempting, indirectly, to achieve goals of this type, 
multiple ranking methods represent a more direct approach to 
a solution of the ranking problem. 

The best estimate of the rank of a set of economic poli- 
cies is simply the ranking of the sample means associated 
with the given policies. Because of random error, however,
sample rankings may yield incorrect results. With what pro-
bability can we say that a ranking of sample means represents
the true ranking of the population means? It is basically 
this question which multiple ranking procedures attempt to 
answer. 

Bechhofer, Dunnett, and Sobel [4] have developed a proce- 
dure for selecting a single population and guaranteeing with 
probability P that the selected population is the "best" pro-
vided some other condition on the parameters is satisfied.
This procedure assumes normality, statistical independence, 
and a common unknown variance. It has been used by Naylor, 
Wertz, and Wonnacott [65] with simulation experiments with a 
model of a multi-process firm to evaluate the profitability
of alternative managerial plans and strategies.
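
The question of how often the sample ranking picks the truly best population can be explored by simulation. The sketch below is not the Bechhofer, Dunnett, and Sobel procedure; it is a simple Monte Carlo estimate of the probability of correct selection under assumed normal populations with a common known standard deviation. The population means, the standard deviation, and the function name are all hypothetical.

```python
import random


def prob_correct_selection(true_means, sigma, n, trials=2000, seed=1):
    """Estimate P(largest sample mean comes from the best population)."""
    rng = random.Random(seed)
    best = max(range(len(true_means)), key=lambda i: true_means[i])
    hits = 0
    for _ in range(trials):
        # One sample mean of n normal observations per population.
        sample_means = [
            sum(rng.gauss(mu, sigma) for _ in range(n)) / n
            for mu in true_means
        ]
        chosen = max(range(len(sample_means)), key=lambda i: sample_means[i])
        hits += (chosen == best)
    return hits / trials


# Three hypothetical policies whose true means differ by one unit.
p_small = prob_correct_selection([100.0, 101.0, 102.0], sigma=5.0, n=5)
p_large = prob_correct_selection([100.0, 101.0, 102.0], sigma=5.0, n=100)
```

With these assumed values the estimated probability of selecting the best policy rises markedly as the number of observations per policy grows.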


Spectral Analysis 


Spectral analysis is a statistical technique frequently 
employed in the physical sciences and more recently applied 
by economists to analyze the behavior of economic time ser-
ies [4154454351561 68575]. There are at least four reasons
why one might want to consider spectral analysis as a tech- 
nique for analyzing data generated by simulation experiments 
with an economic model. 

First, data generated by computer simulation experiments 
are usually highly autocorrelated, e.g., GNP in period t is
likely to be highly correlated with GNP in period t-k. It
is well known that when autocorrelation is present in sample
data, the use of classical statistical estimating tech-
niques (which assume the absence of autocorrelation) will
lead to underestimates of sampling variances (which are un- 
duly large) and inefficient predictions. Several methods are 
available for treating this problem: (1) Simply ignore auto- 
correlation and compute sample means and variances over time 
thereby incurring the aforementioned statistical problems. 
(2) Divide the sample record length into intervals that are
longer than the interval of major autocorrelation and work
with the observations on these supposedly independent inter-
vals [37]. This method suffers from the fact that "the
choices of sample record length and sampling interval seem to 
have neither enough prior nor posterior justification in most 
cases to make this choice much more than arbitrary," [37]. 
(3) Replicate the simulation experiment and compute sample 
means and variances across the ensemble rather than over 
time. This method may lead to excessive computer running 
time and fail to yield the type of information that is desir- 
ed about a particular time series. (4) Replace classical 
statistical theory with a sampling theory such as spectral 
analysis in which the probabilities of component outcomes in 
a time series depend on previous outcomes in the series. 

With spectral analysis the problems associated with methods 
(1) and (2) can be successfully avoided without replicating
the experiment. 
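
The contrast between methods (1) and (2) can be illustrated with a short sketch. It assumes a first-order autoregressive series as a stand-in for simulation output; the coefficient 0.9, the record length, and the interval length of 500 are arbitrary assumptions, as are the function names. The first estimate treats the observations as independent; the second works with the means of long intervals.

```python
import random
import statistics


def ar1_series(n, rho, seed=0):
    """Hypothetical autocorrelated output: x_t = rho * x_{t-1} + noise."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = rho * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


def naive_var_of_mean(series):
    # Method (1): ignore autocorrelation entirely.
    return statistics.variance(series) / len(series)


def batch_var_of_mean(series, batch_len):
    # Method (2): average within long intervals, then treat the
    # interval means as approximately independent observations.
    n_batches = len(series) // batch_len
    means = [statistics.fmean(series[b * batch_len:(b + 1) * batch_len])
             for b in range(n_batches)]
    return statistics.variance(means) / n_batches


series = ar1_series(20000, rho=0.9)
naive = naive_var_of_mean(series)
batched = batch_var_of_mean(series, batch_len=500)
```

For a series this strongly autocorrelated the naive figure is many times smaller than the interval-based figure, which is the underestimation described above. The choice of interval length remains arbitrary, as the quotation from [37] points out.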

Second, for purposes of describing the behavior of a sto- 
chastic variate (such as GNP) over time, the information con- 
tent of spectral analysis is greater than that of sample 
means and variances. "When one studies a stochastic process, 
he is interested in the average level of activity, devia- 
tions from this level, and how long these deviations last, 
once they occur," [37]. Spectral analysis provides this 
kind of information. 
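
As a rough indication of the kind of information a spectrum carries, the sketch below computes a raw periodogram by direct Fourier summation. It is a simplification: the smoothed spectral estimates discussed in the references are not implemented, and the test series (a pure sine wave with a period of eight time units) is an assumption made for illustration.

```python
import cmath
import math


def periodogram(series):
    """Raw periodogram: (frequency, power) pairs for k = 1 .. n/2."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    out = []
    for k in range(1, n // 2 + 1):
        # Discrete Fourier coefficient at frequency k/n cycles per period.
        s = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(centered))
        out.append((k / n, abs(s) ** 2 / n))
    return out


# Hypothetical series: one cycle every 8 periods, 64 observations.
series = [math.sin(2 * math.pi * t / 8) for t in range(64)]
spectrum = periodogram(series)
peak_freq = max(spectrum, key=lambda p: p[1])[0]
```

The power concentrates at the frequency of one cycle per eight periods, showing how long the deviations from the average level last.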

Third, with spectral analysis it is relatively easy to con- 
struct confidence bands and to test hypotheses for the pur- 
pose of comparing the simulated results of the use of two or 
more alternative economic policies. Frequently, it is impos- 
sible to detect differences in time series generated by sim- 
ulation experiments when one restricts himself to simple 
graphical analysis. Spectral analysis provides a means of 
objectively comparing time series generated with a computer 
model. 

Fourth, spectral analysis can also be used as a technique
for validating an econometric model of an economic system.
By comparing the estimated spectra of simulated data and cor-
responding real-world data one can infer how well the simula- 
tion resembles the system it was designed to emulate [37]. 

Fishman and Kiviat [37] have written a pathbreaking arti-
cle on the use of spectral analysis in analyzing data gener-
ated by computer simulation models. The books by Blackman 
and Tukey [6] and Granger and Hatanaka [41] and the papers by 
Jenkins [51] and Parzen [68] are recommended for obtaining 
the basic elements of spectral analysis. Tukey [75] has writ- 
ten a paper in which spectral analysis and the analysis of 
variance are compared in detail. Naylor, Wertz, and Wonna- 
cott [66] have applied spectral analysis to the analysis of 
simulation experiments with econometric models. Spectral an- 
alysis has also been used to compare data generated by compu- 
ter simulation experiments with a model of the textile indus- 
try with corresponding real world data as a technique of ver-
ification [64].


Sequential Sampling 


Since computer time is not a free gift of nature, data gen-
erated by computer simulation experiments (observations) are
costly. The cost of experimentation may be greatly reduced 
if at each stage of the simulation experiment the analyst 
balances the cost of additional observations (generated by 
the computer) against the expected gain in information from 
such observations. With computer simulation experiments the 
objective of sequential sampling is to minimize the number of 
observations (sample size) for obtaining the information 
which is required from the experiment. Rather than setting 
in advance the number of observations to be generated, the 
sample size n is considered a random variable dependent on 
the outcome of the first n-1 observations. In terms of com-
puter time, the cost of a simulation run is minimized by gen-
erating only enough observations to achieve the required re-
sults with predetermined accuracy.




For example, a sequential test on our model could be de- 
signed to determine if the national incomes obtained by using 
a certain fiscal policy in combination with various monetary 
policies differ significantly. The sequential method sets a 
procedure for deciding at the ith observation whether to ac- 
cept a given hypothesis, reject the hypothesis, or continue 
sampling by taking the (i+1)th observation. Such a procedure
must specify for the ith observation a division of the i-dim-
ensional space of all possible observations into three mutu-
ally exclusive and exhaustive sets: an area of preference $A_i$
for accepting the hypothesis, an area of preference $B_i$ for
rejecting it, and an area of indifference $C_i$ where no state-
ment can be made about the hypothesis and further observa-
tions are necessary. The fundamental problem in the theory
of sequential sampling is that of a proper choice of the
sets $A_i$, $B_i$, and $C_i$ [76].
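
One well-known choice of the three sets is Wald's sequential probability ratio test. The sketch below applies it to a deliberately simple case that is assumed here for illustration and is not the test on the model described above: a normal mean with known standard deviation, a simple null hypothesis mu0 and a simple alternative mu1. The function name, the error rates of 0.05, and the observation source are assumptions.

```python
import math
import random


def sprt_normal_mean(draw, mu0, mu1, sigma, alpha=0.05, beta=0.05,
                     max_obs=10000):
    """Wald's SPRT for H0: mean = mu0 against H1: mean = mu1."""
    upper = math.log((1 - beta) / alpha)   # reject H0 at or above this
    lower = math.log(beta / (1 - alpha))   # accept H0 at or below this
    llr, n = 0.0, 0
    while n < max_obs:
        x = draw()                          # generate one more observation
        n += 1
        # Log likelihood ratio contribution of a normal observation.
        llr += (mu1 - mu0) / sigma ** 2 * (x - (mu0 + mu1) / 2)
        if llr >= upper:
            return "reject H0", n
        if llr <= lower:
            return "accept H0", n
        # Otherwise we are in the indifference region: keep sampling.
    return "no decision", n


rng = random.Random(42)
decision, n_used = sprt_normal_mean(lambda: rng.gauss(102.0, 5.0),
                                    mu0=100.0, mu1=102.0, sigma=5.0)
```

The number of observations used is itself random, which is the sense in which the sample size is treated as a random variable.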

Although Wald's Sequential Analysis [76] is the best known 
reference on sequential procedures, chapter 34 of Kendall and 
Stuart [56] contains a comprehensive treatment of this topic. 
The article by Chernoff [19] is also worthy of consideration. 
In addition, the optimization procedures developed by Kiefer 
and others [32,58,59,60] appear to offer promise in analyz-
ing data generated by computer models.


Experimental Design Problems 


In this section we describe four problems which arise in 
the design of simulation experiments and identify some of the
techniques which have been developed to solve them. The four 
experimental design problems include: (1) the problem of sto- 
chastic convergence; (2) the problem of size; (3) the problem
of motive; and (4) the multiple response problem.


The Problem of Stochastic Convergence 


Most experiments are intended to yield information about
population quantities or averages such as average investment
or national income per month for a particular combination of 
monetary and fiscal policies. As estimates of population 
averages the sample averages we compute from several runs on 
a computer will be subject to random fluctuations and will 
not be exactly equal to the population averages. However, 
the larger the sample (i.e., the more runs we observe), the 
greater the probability that the sample averages will be very 
close to the population averages. The convergence of sample 
averages for increasing sample size is called stochastic con-
vergence.

The problem of stochastic convergence is that it is slow. 
A measure of the amount of random fluctuation inherent in a 
chance quantity is its standard deviation. If $\sigma$ is the stan-
dard deviation of a single observation, then the standard de-
viation of the average of n observations is $\sigma/\sqrt{n}$. Thus, in
order to halve the random error one must quadruple the sample
size. To decrease the random error by a factor of ten, one
must increase the sample size by a factor of one hundred. It 
can easily happen that a reasonably small random error re- 
quires an unreasonably large sample size. 
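
The arithmetic of slow convergence can be made concrete with a few lines of Python. The standard deviation of 10 used in the usage lines is a hypothetical figure, and the function names are assumptions.

```python
import math


def std_error(sigma, n):
    """Standard deviation of the average of n independent observations."""
    return sigma / math.sqrt(n)


def sample_size_for(sigma, target_error):
    """Smallest n for which sigma / sqrt(n) does not exceed target_error."""
    return math.ceil((sigma / target_error) ** 2)


# Hypothetical single-observation standard deviation of 10 units.
n_for_one_unit = sample_size_for(10.0, 1.0)    # 100 observations
n_for_half_unit = sample_size_for(10.0, 0.5)   # 400 observations
```

Halving the target error from 1.0 to 0.5 raises the required sample size from 100 to 400 observations.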

Because of the slowness of stochastic convergence we are 
led to seek methods other than increasing sample size to re- 
duce random error. In real world experiments error reduction 
techniques commonly involve including factors such as blocks 
or concomitant variables which are not of basic interest to 
the experimenter. If some of these factors, instead of being 
uncontrolled and unobserved, can be controlled or observed, 
then their effects will no longer contribute to the random
error, and the standard deviation $\sigma$ of a single observation
will be reduced.

In a computer simulation experiment on a given model it is 
not possible to include more factors for error reduction pur- 
poses. The inclusion of more factors requires a change in
the model. Once the model has been specified, all the uncon-
trolled factors have been irretrievably absorbed in the pro-
babilistic specifications for the exogenous inputs. 

There are, however, error reduction techniques which are 
suitable for computer simulation experiments. They are call- 
ed Monte Carlo techniques [22,33,43,53]. The underlying 
principle of Monte Carlo techniques is the utilization of 
knowledge about the structure of the model, properties of 
the probability distributions of the exogenous inputs, and 
properties of the observed variates actually used for inputs 
to increase the precision (i.e., reduce random error) in the 
measurement of averages for the response variables. 

Hammersley and Handscomb [43] have written an excellent 
book on the subject of Monte Carlo techniques. Some of the 
techniques they discuss are importance sampling, control var-
iates, correlation (i.e., regression methods and antithetic
variate methods), and conditional Monte Carlo. The book also
contains an extensive bibliography.
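
Of the techniques just listed, the antithetic variate method is the simplest to sketch. The example below is a toy problem assumed for illustration, not one drawn from the economic model: it estimates the expected value of exp(U) for a uniform variate U on (0, 1), whose true value is e - 1. Each uniform variate u is paired with its antithetic partner 1 - u, and the variance of the resulting estimator is compared with that of a crude estimator using the same number of function evaluations.

```python
import math
import random
import statistics


def crude_estimate(rng, n):
    # Average of n independent evaluations.
    return statistics.fmean(math.exp(rng.random()) for _ in range(n))


def antithetic_estimate(rng, n):
    # n evaluations arranged as n/2 antithetic pairs (u, 1 - u).
    pairs = n // 2
    total = 0.0
    for _ in range(pairs):
        u = rng.random()
        total += math.exp(u) + math.exp(1.0 - u)
    return total / (2 * pairs)


rng = random.Random(7)
crude = [crude_estimate(rng, 100) for _ in range(500)]
anti = [antithetic_estimate(rng, 100) for _ in range(500)]
var_crude = statistics.variance(crude)
var_anti = statistics.variance(anti)
```

Because exp(u) and exp(1 - u) are negatively correlated, the paired estimator fluctuates far less than the crude one for the same computing effort.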


The Problem of Size 


What we have called the problem of size arises in both 
real world and computer simulation experiments. It could 
just as easily be called "the problem of too many factors." 
In a factorial design for several factors the number of cells 
required is the product of the number of levels for each of 
the factors in the experiment. Thus, if our experiment with 
the Samuelson-Hicks model were extended to a four factor ex-
periment involving 6 levels for b, 5 levels for $c_1$, 5 levels
for $c_2$, and 5 levels for g, a total of $6 \times 5 \times 5 \times 5 = 750$
cells would be required for a full factorial design. If we 
included the expected value, variance, and probability distri-
bution of $u_t$ and $v_t$ as factors as well as b, $c_1$, $c_2$, and g,
we would have ten factors. Even if we only used two levels
for each of these factors, the full factorial experiment
would require $2^{10} = 1024$ cells. It is clear that the full factorial de-
sign can require an unmanageably large number of cells if
more than a few factors are to be investigated.
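
The growth in the number of cells is easy to verify. The sketch below counts the cells of a full factorial design as the product of the numbers of levels and, for a small hypothetical case of three two-level factors, enumerates the cells explicitly. The function name and the coding of levels as 0 and 1 are assumptions.

```python
import math
from itertools import product


def full_factorial_cells(levels):
    """Number of cells: the product of the number of levels per factor."""
    return math.prod(levels)


# Ten factors at two levels each, as in the text.
two_level_ten_factors = full_factorial_cells([2] * 10)

# Explicit list of cells for three two-level factors (levels coded 0, 1).
design = list(product([0, 1], repeat=3))
```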

If we require a complete investigation of the factors in 
the experiment, including main effects and interactions of 
all orders, then there is no solution to the problem of size. 
If, however, we are willing to settle for a less than com- 
plete investigation, perhaps including main effects and two- 
factor interactions, then there are designs which will ac- 
complish our purpose and which require fewer cells than the 
full factorial. Fractional factorial designs, including
Latin square and Greco-Latin square designs, are examples of 
designs which require only a fraction of the cells required 
by the full factorial design. 

So far the problem of size reduction has been discussed in 
an analysis of variance framework. As was mentioned in the 
section on analysis of variance, this collection of tech-
niques for data analysis (i.e., the analysis of variance) is
appropriate when the factors are qualitative. However, if
the factors $x_1, x_2, \ldots, x_k$ are quantitative, and the re-
sponse y is related to the factors by some mathematical func-
tion f, then regression analysis, rather than the analysis of
variance, may be an appropriate method of data analysis. The
functional relationship $y = f(x_1, \ldots, x_k)$ between the re-
sponse and the quantitative factors is called the response
surface [9,23,29]. Least squares regression analysis is a
method for fitting a response surface to observed data in 
such a way as to minimize the sum of squared deviations of 
the observed responses from the value predicted from the fit- 
ted response surface. 
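
A minimal sketch of a least squares fit is given below. It assumes a first-order response surface in two factors, observed on a 2 x 2 factorial design with levels coded as -1 and +1; the four responses are invented. Because the coded columns of such a design are mutually orthogonal and each sums to zero, the least squares coefficients reduce to simple averages. For a design without this property the normal equations would have to be solved in full.

```python
# Coded 2 x 2 factorial design with hypothetical responses.
design = [(-1, -1), (1, -1), (-1, 1), (1, 1)]
y = [8.0, 12.0, 9.0, 13.0]


def fit_first_order(design, y):
    """Least squares fit of y = b0 + b1*x1 + b2*x2 (orthogonal design)."""
    n = len(y)
    b0 = sum(y) / n
    b1 = (sum(x[0] * yi for x, yi in zip(design, y))
          / sum(x[0] ** 2 for x in design))
    b2 = (sum(x[1] * yi for x, yi in zip(design, y))
          / sum(x[1] ** 2 for x in design))
    return b0, b1, b2


def sum_squared_deviations(design, y, coeffs):
    # The quantity that least squares minimizes.
    b0, b1, b2 = coeffs
    return sum((yi - (b0 + b1 * x[0] + b2 * x[1])) ** 2
               for x, yi in zip(design, y))


coeffs = fit_first_order(design, y)
sse = sum_squared_deviations(design, y, coeffs)
```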

For an experiment which utilizes regression analysis to 
explore a response surface a factorial design or a fractional 
factorial design may not be optimal. Several authors, pri- 
marily George Box [16,17], have developed designs called re- 
sponse surface designs which are appropriate when response 
surface exploration via regression analysis is the aim of the
experiment. An important advantage of the response surface
designs in comparison with comparable factorial designs is
the reduction in the required size of the experiment without 
a corresponding reduction in the amount of information ob- 
tained. 

Response surface designs have not been given the attention 
they deserve in most of the books on experimental design. An 
exception is chapter 8A in the second edition of the book by 
Cochran and Cox [23]. Fortunately, there are a number of 
readable journal articles on response surface designs, includ- 
ing Box and Draper [12], Box and J.S. Hunter [13], and Box 
and W.G. Hunter [14]. The recent paper by Hill and W.G. Hun- 
ter [47] contains a survey of response surface designs and a 
complete bibliography. Response surface designs were used by 
Hufschmidt [49] to design computer simulation experiments with
a model of a water-resource system. Austin C. Hoggatt has
used response surface designs with simulation experiments
with a computer model of a market [69].


The Problem of Motive 


The experimenter should specify his objectives as precise- 
ly as possible to facilitate the choice of a design which 
will best satisfy his objectives. Two important types of ex- 
perimental objectives can be identified: (1) the experiment- 
er wishes to find the combination of factor levels at which 
the response variable is maximized (or minimized) in order to 
optimize some process, (2) the experimenter wishes to make a 
rather general investigation of the relationship of the re- 
sponse to the factors in order to determine the underlying 
mechanisms governing the process under study. The distinc-
tion between these two aims is less important when the fac-
tors are qualitative than it is when the factors are quanti- 
tative. Unless certain interactions can be assumed to be 
zero, the only way to find the combination of levels of qual- 
itative factors which will produce an optimum response is to
measure the response at all combinations of factor levels (i.e.,
the full factorial design). Even if interactions are
assumed negligible in an experiment with qualitative factors,
the design is likely to be the same whether the aim is to op-
timize or to explore.

In an experiment with quantitative factors the picture is 
quite different. Here, the continuity of the response sur-
face can usually be used to guide us quickly and efficiently
to a determination of the optimum combination of factor lev- 
els. There are two commonly used sampling methods for find- 
ing the optimum of a response surface: systematic sampling 
and random sampling. Systematic sampling methods include:
(1) the uniform-grid or factorial method; (2) the single-fac-
tor method; (3) the method of marginal analysis; and (4) the
method of steepest ascent. The article by Hufschmidt [49]
contains a case study involving the use of both systematic
and random sampling methods for the design of a simulation
experiment. A detailed description of several of these meth-
ods can be found in Cochran and Cox [23].
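
The idea behind the method of steepest ascent can be sketched as follows. The response function below is a hypothetical smooth surface free of random error, chosen so that its maximum is known; in a real simulation experiment the slopes would instead be estimated from noisy responses, typically by fitting a first-order model to a small design around the current point. The step size, the iteration count, and the function names are assumptions.

```python
def response(x1, x2):
    # Hypothetical smooth response surface with its maximum at (2, -1).
    return 50.0 - (x1 - 2.0) ** 2 - 2.0 * (x2 + 1.0) ** 2


def gradient(f, x1, x2, h=1e-4):
    """Central-difference estimate of the slope in each factor."""
    d1 = (f(x1 + h, x2) - f(x1 - h, x2)) / (2 * h)
    d2 = (f(x1, x2 + h) - f(x1, x2 - h)) / (2 * h)
    return d1, d2


def steepest_ascent(f, x1, x2, step=0.1, iterations=200):
    """Repeatedly move the factor levels in the direction of steepest rise."""
    for _ in range(iterations):
        d1, d2 = gradient(f, x1, x2)
        x1, x2 = x1 + step * d1, x2 + step * d2
    return x1, x2


best = steepest_ascent(response, 0.0, 0.0)
```

Starting from the origin, the search settles at the factor levels that maximize the assumed response, using far fewer evaluations than a fine uniform grid would require.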

When general exploration of a response surface is the aim, 
it is difficult to identify a best experimental design be-
cause general exploration is usually a less precisely speci-
fied goal than optimization. However, we can state a guid- 
ing principle: when the aim of an experiment is to further 
general knowledge and understanding, it is important to give 
careful and precise consideration to the existing state of 
knowledge and to questions and uncertainties upon which we 
desire the experimental data to shed some light. An excel-
lent paper on the uses of experiments to further general under-
standing, including the role played by experimental design,
is the one by Box and W.G. Hunter [14]. 


The Multiple Response Problem 

Before concluding this paper we should mention one other
experimental design problem. The multiple response problem 
arises when we wish to observe many different response vari-
ables in a given experiment. The multiple response problem
occurs frequently in computer simulation experiments with
economic systems. In the Samuelson-Hicks model, consumption, 
investment, governmental expenditures, and national income 
might be treated as response variables. Computer simulation 
experiments with the complete Brookings Quarterly Econometric 
Model of the United States involve several hundred response 
variables. 

It is often possible to bypass the multiple response pro- 
blem by treating an experiment with many responses as many 
experiments each with a single response. Or several re- 
sponses could be combined (e.g., by addition) and treated as 
a single response. However, it is not always possible to by- 
pass the multiple response problem; often multiple responses 
are inherent to the situation under study. Unfortunately, 
experimental design techniques for multiple response experi- 
ments are virtually nonexistent. 

Any attempt to solve the multiple response problem is
likely to require the use of utility theory. Gary Fromm [39] 
has taken an initial step in this direction by using utility 
theory to evaluate the results of policy simulation experi- 
ments with the Brookings Model. The specific problem with 
which Fromm was confronted was how to choose among alterna- 
tive economic policies which affect a large number of differ-
ent response variables in many different ways.


Bibliography 


1. Bartlett, M.S. An Introduction to Stochastic Processes
with Special Reference to Methods and Applications.
Cambridge: The University Press, 1962.


2. Bartlett, M.S. "The Use of Transformations," Biometrics,
III (1947), 39-52.


3. Bechhofer, R.E. "A Single Sample Multiple Decision Procedure
for Ranking Means of Normal Populations with Known Vari-
ances," Annals of Mathematical Statistics, XXV (1954),
16-39.


4. Bechhofer, Robert E., Dunnett, C.W., and Sobel, Milton.
    "A Two-Sample Multiple Decision Procedure for Ranking
    Means of Normal Populations with a Common Unknown
    Variance," Biometrika, XLI (1954), 170-176.

5. Bechhofer, Robert E., and Sobel, Milton. "A Single-Sample
    Multiple Decision Procedure for Ranking Variances
    of Normal Populations," Annals of Mathematical
    Statistics, XXV (1954), 273-289.

6. Blackman, R.B., and Tukey, J.W. The Measurement of Power
    Spectra. New York: Dover Publications, Inc., 1958.

7. Bonini, Charles P. Simulation of Information and Decision
    Systems in the Firm. Englewood Cliffs, N.J.:
    Prentice-Hall, Inc., 1963.

8. Box, G.E.P. "Multi-factor Designs of First Order,"
    Biometrika, XXXIX (1952), 49-57.

9. Box, G.E.P. "The Exploration and Exploitation of Response
    Surfaces: Some General Considerations and Examples,"
    Biometrics, X (1954), 16-60.

10. Box, G.E.P., and Behnken, D.W. "Some New Three Level
    Designs for the Study of Quantitative Variables,"
    Technometrics, II (1960), 455-475.

11. Box, G.E.P., and Cox, D.R. "An Analysis of Transformations,"
    Journal of the Royal Statistical Society B,
    XXVI (1964), 211-252.

12. Box, G.E.P., and Draper, N.R. "A Basis for the Selection
    of a Response Surface Design," Journal of the American
    Statistical Association, LIV (1959), 622-654.

13. Box, G.E.P., and Hunter, J.S. "Multi-factor Experimental
    Designs for Exploring Response Surfaces," Annals of
    Mathematical Statistics, XXVIII (1957), 195-241.

14. Box, G.E.P., and Hunter, William G. "The Experimental
    Study of Physical Mechanisms," Technometrics, VII
    (1965), 23-42.

15. Box, G.E.P., and Tidwell, P.W. "Transformation of the
    Independent Variables," Technometrics, IV (1962),
    531-550.

16. Box, G.E.P., and Wilson, K.B. "On the Experimental
    Attainment of Optimum Conditions," Journal of the Royal
    Statistical Society B, XIII (1951), 1-45.

17. Box, G.E.P., and Youle, P.V. "The Exploration and
    Exploitation of Response Surfaces: An Example of the Link
    Between the Fitted Surface and the Basic Mechanism of the
    System," Biometrics, XI (1955), 287-323.

18. Burdick, Donald S., and Naylor, Thomas H. "Design of
    Computer Simulation Experiments for Industrial Systems,"


    Communications of the ACM, (1966), 329-339.

19. Chernoff, Herman. "Sequential Design of Experiments,"
    Annals of Mathematical Statistics, XXX (September, 1959),
    755-770.

20. Chew, Victor (ed.). Experimental Design in Industry.
    New York: John Wiley & Sons, 1958.

21. Chu, Kong, and Naylor, Thomas H. "A Dynamic Model of the
    Firm," Management Science, XI (May, 1965), 736-750.

22. Clark, C.E. "Importance Sampling in Monte Carlo Analysis,"
    Operations Research, IX (1961), 603-620.

23. Cochran, W.G., and Cox, G.M. Experimental Designs. New
    York: John Wiley & Sons, 1957.

24. Cohen, Kalman J. Computer Models of the Shoe, Leather,
    Hide Sequence. Englewood Cliffs, N.J.: Prentice-Hall,
    Inc., 1960.

25. Conway, R.W. "Some Tactical Problems in Digital Simulation,"
    Management Science, X (October, 1963), 47-61.

26. Conway, R.W., Johnson, B.M., and Maxwell, W.L. "Some
    Problems of Digital Systems Simulation," Management
    Science, VI (October, 1959), 92-110.

27. Cox, D.R. Planning of Experiments. New York: John Wiley
    & Sons, 1958.

28. Cyert, Richard M., and March, James G. A Behavioral
    Theory of the Firm. Englewood Cliffs, N.J.:
    Prentice-Hall, Inc., 1963.

29. Davies, O.L. (ed.). Design and Analysis of Industrial
    Experiments. New York: Hafner Publishing Co., 1960.

30. Dear, R.E. "Multivariate Analysis of Variance and
    Covariance for Simulation Studies Involving Normal Time
    Series," System Development Corporation, FN-5644,
    November, 1961.

31. Draper, N.R., and Smith, H. Applied Regression Analysis.
    New York: John Wiley & Sons, 1966.

32. Dvoretzky, A., Kiefer, J., and Wolfowitz, J. "Sequential
    Decision Problems for Processes with Continuous Time
    Parameter: Problems of Estimation," Annals of
    Mathematical Statistics, XXIV (1953), 403-415.

33. Ehrenfield, S., and Ben-Tuvia, S. "The Efficiency of
    Statistical Simulation Procedures," Technometrics, IV
    (May, 1962), 257-275.

34. Eisenhart, Churchill. "The Assumptions Underlying the
    Analysis of Variance," Biometrics, III (1) (March,
    1947), 1-21.


35. Fisher, Ronald A. The Design of Experiments. London:
    Oliver and Boyd, 1951.

36. Fishman, George S. "Problems in the Statistical Analysis
    of Simulation Experiments: The Comparison of Means and
    the Length of Sample Records," Communications of the
    ACM, X (February, 1967), 94-99.

37. Fishman, George S., and Kiviat, Philip J. "Spectral
    Analysis of Time Series Generated by Simulation Models,"
    Management Science, XIII (March, 1967), 525-557.

38. "Fractional Factorial Designs for Factors at Two and
    Three Levels," U.S. Department of Commerce, National
    Bureau of Standards, Applied Mathematics Series 58,
    U.S. Government Printing Office, Washington, D.C.,
    September 1, 1961.

39. Fromm, Gary. "An Evaluation of Monetary Policy
    Instruments," Paper presented at the annual meetings of the
    Econometric Society, San Francisco, December, 1966.

40. Gafarian, A.V., and Ancker, C.J. "Mean Value Estimation
    From Digital Computer Simulation," Operations Research,
    (January-February, 1966).

41. Granger, C.W.G., and Hatanaka, M. Spectral Analysis of
    Economic Time Series. Princeton, N.J.: Princeton
    University Press, 1964.

42. Graybill, Franklin A. An Introduction to Linear
    Statistical Models, I. New York: McGraw-Hill Book Co., Inc.,
    1961.

43. Hammersley, J.M., and Handscomb, D.C. Monte Carlo
    Methods. New York: John Wiley & Sons, 1964.

44. Hannan, E.J. Time Series Analysis. New York: John Wiley
    and Sons, 1960.

45. Hicks, Charles R. Fundamental Concepts in the Design of
    Experiments. New York: Holt, Rinehart, & Winston, 1964.

46. Hicks, J.R. A Contribution to the Theory of the Trade
    Cycle. Oxford: Clarendon Press, 1950.

47. Hill, William J., and Hunter, William G. "A Review of
    Response Surface Methodology: A Literature Survey,"
    Technometrics, VIII (November, 1966), 571-590.

48. Howrey, E. Philip. "Stabilization Policy in Linear
    Stochastic Systems," Unpublished paper, Econometric
    Research Program, Princeton University, January, 1966.
    (Mimeographed).

49. Hufschmidt, M.M. "Analysis by Simulation: Examination of
    Response Surface," Design of Water-Resource Systems.
    Edited by Arthur Maass, et al. Cambridge: Harvard
    University Press, 1966.


50. Jacoby, J.H., and Harrison, S. "Multi-Variable
    Experimentation and Simulation Models," Naval Research
    Logistics Quarterly, IX (1962), 121-156.

51. Jenkins, G.M. "General Considerations in the Analysis of
    Spectra," Technometrics, III (May, 1961), 153-166.

52. Kagidis, J., and Lackner, M.R. "Introduction to
    Management Control Systems Research," System Development
    Corporation, TM-708/000/00 (October 15, 1962).

53. Kahn, Herman, and Mann, Irwin. "Monte Carlo," The RAND
    Corporation, P-1165 (July 30, 1957).

54. Kempthorne, Oscar. The Design and Analysis of
    Experiments. New York: John Wiley & Sons, 1952.

55. Kendall, M.G. "On Autoregressive Time Series,"
    Biometrika, XXXIII (1944), 105-122.

56. Kendall, M.G., and Stuart, Alan. The Advanced Theory of
    Statistics, II: Inference and Relationship. New York:
    Hafner Publishing Co., 1961.

57. Kendall, M.G., and Stuart, Alan. The Advanced Theory of
    Statistics, III: Design and Analysis, and Time-Series.
    New York: Hafner Publishing Co., 1966.

58. Kiefer, J. "Sequential Minimax Search for a Maximum,"
    Proceedings of the American Mathematical Society, (June,
    1953).

59. Kiefer, J. "Invariance, Minimax Sequential Estimation,
    and Continuous Time Processes," Annals of Mathematical
    Statistics, XXVIII (1957), 573-601.

60. Kiefer, J., and Sacks, J. "Asymptotically Optimum
    Sequential Inference and Design," Annals of Mathematical
    Statistics, (September, 1963), 705-750.

61. Malinvaud, E. Statistical Methods of Econometrics.
    Chicago: Rand McNally, 1966.

62. Naylor, Thomas H., Balintfy, Joseph L., Burdick, Donald
    S., and Chu, Kong. Computer Simulation Techniques.
    New York: John Wiley & Sons, 1966.

63. Naylor, Thomas H., and Finger, J.M. "Verification of
    Computer Simulation Models," Management Science,
    (October, 1967), 92-101.

64. Naylor, Thomas H., Wallace, William H., and Sasser, W.
    Earl. "A Computer Simulation Model of the Textile
    Industry," Journal of the American Statistical
    Association, LXII (December, 1967), 1338-1364.

65. Naylor, Thomas H., Wertz, K., and Wonnacott, Thomas H.
    "Methods for Analyzing Data from Computer Simulation
    Experiments," Communications of the ACM, X (November,
    1967), 703-710.


66. Naylor, Thomas H., Wertz, K., and Wonnacott, Thomas H.
    "Spectral Analysis of Data Generated by Simulation
    Experiments with Econometric Models," Econometrica,
    (1969).

67. Naylor, Thomas H., Wertz, K., and Wonnacott, Thomas H.
    "Methods for Evaluating the Effects of Economic
    Policies Using Simulation Experiments," Review of the
    International Statistical Institute, (1968).

68. Parzen, Emanuel. "Mathematical Considerations in the
    Estimation of Spectra," Technometrics, III (May, 1961),
    167-190.

69. Preston, Lee E., and Collins, Norman R. Studies in a
    Simulated Market, Research Program in Marketing,
    Graduate School of Business Administration, University of
    California, Berkeley, 1966.

70. Quenouille, M.H. The Design and Analysis of Experiments.
    New York: Hafner Publishing Co., 1953.

71. Samuelson, Paul A. "Interactions Between the Multiplier
    Analysis and the Principle of Acceleration," Review of
    Economic Statistics, XXI (May, 1939), 75-78.

72. Sasser, W. Earl, and Naylor, Thomas H. "Computer
    Simulation of Economic Systems: An Example Model," Simulation,
    VIII (January, 1967), 21-32.

73. Scheffé, Henry. The Analysis of Variance. New York: John
    Wiley & Sons, 1959.

74. Tocher, K.D. The Art of Simulation. Princeton, N.J.: D.
    Van Nostrand Co., 1963.

75. Tukey, John W. "Discussion Emphasizing the Connection
    Between Analysis of Variance and Spectral Analysis,"
    Technometrics, III (May, 1961), 191-219.

76. Wald, A. Sequential Analysis. New York: John Wiley &
    Sons, 1947.

77. Winer, B.J. Statistical Principles in Experimental
    Design. New York: McGraw-Hill Book Co., 1962.







Experimental Designs 







Experimental Design 


J. S. Hunter, Princeton University, and 
Thomas H. Naylor, Duke University 


Simulation Defined 


Let Y denote some output variable of a system which we wish
to study, and let X denote the k variables which are thought
to influence Y according to the functional relationship

(1)     $Y = \phi(X)$

In the experimental design literature Y is said to be a
response and the $X_i$'s ($i = 1, \ldots, k$) are said to be factors.
The function $\phi$ is called a response surface. A special case
of (1) is the simple linear model

(2)     $Y = \sum_{i=1}^{k} \theta_i X_i$

where the $\theta_i$'s are parameters. If experimentation were
possible, one could vary the $X_i$, observe Y, determine the
estimates of the parameters $\theta_i$, and then interpret the
fitted model

(3)     $\hat{Y} = \sum_{i=1}^{k} \hat{\theta}_i X_i$

where $\hat{\theta}_i$ and $\hat{Y}$ denote estimates of $\theta_i$
and Y respectively.

Unfortunately, it is frequently impossible or impractical to
perform controlled experiments with business and economic
systems. However, it may be possible to perform a type of
quasi-experiment with a model of the system through the use of
computer simulation techniques [26]. In our example, if
experimentation were impossible, the response could be simulated by
varying either or both of the $\theta_i$ and the $X_i$.

The additive model (3) which we have proposed is altogether
too simple to be treated as a simulation model. That is, it
could probably be solved by straightforward analytical
techniques and would not require the use of numerical or simulation
techniques. To make the model more realistic we might add on
a random variable $\epsilon$ and rewrite the model as

(4)     $Y = \sum_{i=1}^{k} \theta_i X_i + \epsilon$

where the probability density function of $\epsilon$ is given by
$f(\epsilon, \mu)$ and $\mu$ represents the parameters of the
distribution. To introduce further realism (and complexity),
transformations $g(Y)$ and $h(X_i)$ on the response or on one or
more of the $X_i$ could be included in the model. Some of these
transformations might involve nonlinearities as well as
additional parameters. Additional stochastic variables $\gamma_j$,
each entering with its own weight $\beta_j$ and each with its own
distribution $V(\gamma_j)$ and parameters, could also be
introduced into the model. A time dependence denoted by the
subscript t could also be employed. In general, the model would
then be specified as follows:


(5)     $g(Y_t) = \sum_{i=1}^{k} \theta_i h(X_{it}) + \sum_{j=1}^{m} \beta_j \gamma_{jt} + \epsilon_t$


Dummy variables $\delta$, consisting of ones and zeros, could be
used to indicate the presence or absence of certain variables
at certain times, and to identify blocks of variables that are
used together. Constraints may also be imposed on the variables
and parameters of the model. Dynamic feedback mechanisms may
be built into the model by letting $X_{it}$ be a function of
$Y_{t-1}, Y_{t-2}, \ldots$. Finally, we might modify the additivity
aspects of the model or introduce other response variables, thus
converting the model into a multiple response model. Clearly,
at this point we would have a model which cannot be solved by
analytical techniques. We would then have to resort to
simulation as a mode of analysis.

In summary, a simulation model is characterized by: (1) many
variables $X_i$ and their functions; (2) stochastic variables
$\gamma$ and $\epsilon$ and their distributions; (3) many
parameters $\theta$, $\beta$, and $\mu$; (4) many linkages $\delta$
between elements of the model; (5) nonlinearities; (6) assorted
constraints; and (7) a response (or responses) that may or may
not have a time path. There is one other characteristic of a
simulation model that should be mentioned: a computer is usually
an essential adjunct.


Experimental Design 


The objective of any experimental investigation is to learn
more about the system being investigated. As George Box [8]
has pointed out, the aim of the experiment might be to explore
and describe the response surface over some region of interest
in the factor space, or it might be to optimize the response
over some operability region in the factor space. In either
case the basic feature of the experiment is the investigation
of the response surface using observations of the response at
various factor levels as data.

Associated with each of the aforementioned experimental
objectives is a set of experimental designs. These designs have
been created to provide not only economy in the required number
of experimental trials, but also such additional qualities as
minimum variance estimates, measures of the adequacy of the
models, desirable confounding patterns, and ease of computation.

Since a computer simulation experiment is indeed an experiment,
careful attention should be given to the problem of
experimental design. In the following section we describe an
example simulation model. We then describe two different
simulation experiments with the model which serve to illustrate
the use of several different experimental designs including
full factorial, fractional factorial, rotatable, and response
surface.


An Example Model [26] 


Consider an inventory system in which daily demand D and
production lead time LT are both stochastic variates with known
probability distributions given by f(D) and g(LT) respectively.
The inventory level is reduced each day by the total demand for
that day. When the inventory level becomes less than or equal
to the reorder point ROP, a production order is issued for
an "optimum" order quantity EOQ. When a production order is
filled, the number of units of the product ordered is added to
the inventory stock. The total cost TC of operating the
inventory system is the sum of the carrying, set-up, and shortage
costs. Unit carrying, set-up, and shortage costs are given by
C1, C2, and C3 respectively. The response surface for this
relatively simple example is given by

(6)     TC = TC(C1, C2, C3, f(D), g(LT), EOQ, ROP)

The parameters C1, C2, and C3 and the density function for
demand are assumed to be given by market conditions and not
subject to the control of the decision maker. The density
function g(LT) is fixed in the short run by technology. EOQ and ROP
are the only controllable decision variables (factors).
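The verbal description of the inventory system can be turned into a small day-by-day simulation. The following is only a minimal sketch under stated assumptions: the function name `simulate_total_cost`, the exponential forms chosen for f(D) and g(LT), the starting stock, the treatment of unmet demand as lost, and the rule of at most one open production order are all hypothetical choices made for illustration and are not specified in the text.

```python
import random

def simulate_total_cost(c1, c2, c3, mean_demand, mean_lead_time,
                        eoq, rop, days=180, seed=0):
    """Return the total cost TC of one simulated run of the inventory system."""
    rng = random.Random(seed)
    inventory = eoq          # assumed starting stock (not given in the text)
    pending = None           # days until the open production order arrives
    total_cost = 0.0
    for _ in range(days):
        # Receive the outstanding production order once its lead time elapses.
        if pending is not None:
            pending -= 1
            if pending <= 0:
                inventory += eoq
                pending = None
        # Daily demand D: exponential with the given mean (an assumed f(D)).
        demand = rng.expovariate(1.0 / mean_demand)
        inventory -= demand
        if inventory < 0:
            total_cost += c3 * (-inventory)   # shortage cost on unmet demand
            inventory = 0.0                   # unmet demand is lost (assumption)
        total_cost += c1 * inventory          # carrying cost on stock on hand
        # Issue a production order when stock falls to the reorder point.
        if inventory <= rop and pending is None:
            total_cost += c2                  # set-up cost per order
            # Lead time LT: exponential, in whole days, at least one day
            # (an assumed g(LT)).
            pending = max(1, round(rng.expovariate(1.0 / mean_lead_time)))
    return total_cost
```

A call such as `simulate_total_cost(1.0, 50.0, 10.0, 20.0, 3.0, 200.0, 80.0, seed=1)` produces one observation of TC for one setting of the seven factors; changing `seed` gives an independent replication of the same design point.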

We are interested in determining the effects of the seven
factors in this model on the response variable TC. We shall
define the factors or design variables as follows:

    $X_1$ = C1          $X_5$ = E(LT)
    $X_2$ = C2          $X_6$ = EOQ
    $X_3$ = C3          $X_7$ = ROP
    $X_4$ = E(D)

We run the simulation experiment once for the set of initial
values of the seven factors. Next we vary the seven factors
and observe (through additional simulation experiments) the
effects on TC of different levels of each factor. Suppose that
we consider two different levels of each factor. We can then
use + and - signs to identify the two levels of each factor.
For example, if the initial value of C1 was 100, we might wish
to investigate $X_1$ equal to C1 ± δ where δ = 25. The effect of
the ith factor upon TC is estimated by
$\overline{TC}_+ - \overline{TC}_-$, where the $\overline{TC}$'s
are the average responses observed when $X_i$ is at its + and -
levels respectively. If the stochastic variates D and LT had
not been included in the model, the most important factor (over
the ranges studied) could be quickly identified as the $X_i$ with
the largest effect. The presence of the stochastic variates D
and LT requires that we replicate each simulation experiment
several times. The average TC for each experiment will then
be used as observations to determine the effects of different
factors on the response variable.
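The effect estimate described above (average response at the + level minus average response at the - level, each averaged over replications) can be sketched as follows. The response function `toy_run` is a hypothetical stand-in for a full simulation run, and reusing the same seeds at both levels is a choice made in this sketch, not something prescribed by the text.

```python
import random
from statistics import mean

def replicated_mean(run, settings, replications, base_seed=0):
    """Average the response of `run` over several seeded replications."""
    return mean(run(settings, seed=base_seed + r) for r in range(replications))

def main_effect(run, baseline, factor, delta, replications=30):
    """Estimate one factor's effect as mean(TC at +) - mean(TC at -)."""
    high = dict(baseline, **{factor: baseline[factor] + delta})
    low = dict(baseline, **{factor: baseline[factor] - delta})
    return (replicated_mean(run, high, replications)
            - replicated_mean(run, low, replications))

def toy_run(settings, seed):
    """Hypothetical stand-in for the simulation: linear response plus noise."""
    rng = random.Random(seed)
    return 2.0 * settings["C1"] + 0.5 * settings["ROP"] + rng.gauss(0.0, 5.0)

baseline = {"C1": 100.0, "ROP": 40.0}
# C1 is moved from 100 - 25 to 100 + 25, as in the example in the text.
effect_c1 = main_effect(toy_run, baseline, "C1", delta=25.0)
```

Because the toy response has slope 2 in C1 and the two levels are 50 units apart, the estimate comes out at essentially 100; with a real stochastic simulation the estimate would carry sampling error that shrinks as the number of replications grows.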

In the following sections we describe two different types
of simulation experiments with our example model: (1) an
exploratory model and (2) an optimization model. In each case
we describe several alternative experimental designs which
may be appropriate for a given experiment.


Exploratory Experiments 


Suppose that with our inventory example we are interested
in exploring the relationship between our seven factors (C1,
C2, C3, E(D), E(LT), EOQ, and ROP) and the response variable
TC. If this is the case, then several experimental designs
are available for conducting exploratory simulation experiments
with our model so as to gain some insight into the underlying
mechanisms of the inventory system. We shall consider four
different designs: full factorial designs, fractional factorial
designs, rotatable designs, and response surface designs.


Full Factorial Designs 


A full factorial design for our seven-factor experiment
would involve selecting several values or levels for each of
the seven factors in the experiment. By assigning to each
factor one of its levels we generate a design point. If all the
design points obtainable in this way are used, we would have
a (full) factorial design. Factorial designs [17, pp. 335-354]
attempt to cover the relevant range of a factor by a series of
uniformly spaced values.


The great advantage of [factorial designs] is [their]
ability to map the entire response surface of systems with
a small number of [factors]. The effectiveness of the
method depends to a significant degree on the nature of the
response surface. The gentler the slopes and the rounder
the peaks and ridges, the more exactly does a [factorial
design] of a given size portray the surface and approximate
its highest points [25].


The total number of design points in the full factorial
design is the product of the numbers of levels for each factor.
If we have k factors with n values each, then the number of
design points required is $n^k$. In our example model, which
contains seven factors, if we considered only two levels of each
factor, $2^7 = 128$ design points would be required for a full
factorial design.
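The count of design points can be checked by enumerating the design directly. This is a small illustrative sketch; the helper name `full_factorial` and the coding of the two levels as -1 and +1 are assumptions made here for convenience.

```python
from itertools import product

def full_factorial(levels_per_factor):
    """Return every combination of factor levels, one tuple per design point."""
    return list(product(*levels_per_factor))

# Seven factors, each at a low (-1) and a high (+1) level.
design = full_factorial([(-1, +1)] * 7)
n_points = len(design)   # product of the numbers of levels: 2**7
```

With unequal numbers of levels the same function applies, and the length of the result is the product of the individual level counts.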

Suppose that each of the 128 design points consisted of a 
simulation run corresponding to 180 days of simulated experi- 
ence with the inventory system. A conservative estimate of the 
amount of computer time required for a 180-day run might be 
about 15 seconds. (This estimate of computer time might very 
well be much higher if the probability density functions of 
demand and lead time are very complex.) If each of the 128 
runs is replicated, say 30 times, to reduce the effects of 
random error, then a full factorial experiment would require 
16 hours of computer time. 
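The 16-hour figure follows directly from the numbers quoted in this paragraph, as the short calculation below shows.

```latex
128 \text{ design points} \times 30 \text{ replications} \times 15 \text{ s per run}
  = 57{,}600 \text{ s} = \frac{57{,}600}{3{,}600} \text{ h} = 16 \text{ h}
```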

It is clear that the full factorial design can require an
unmanageably large number of design points if more than a very
few factors are to be investigated.


Fractional Factorial Designs 


Since a full factorial design for our example model leads to 
excessive amounts of computer time, we would like to find 
another design which requires fewer design points but which 
does not cause us to forego a great deal of information about 
the nature of the response function we are exploring with our 
simulation experiments. Fractional factorial designs enable 
us to accomplish this objective. 

If we are willing to settle for a less than complete
investigation, perhaps including main effects and two-factor
interactions, and excluding three-factor or higher-order
interaction effects, then there are designs which will accomplish our
purpose and which require fewer trials than the full factorial.
Fractional factorial designs, which include Latin square and
Greco-Latin square designs as special cases, are examples of
designs which require only a fraction of the trials required
by the full factorial design.

The major use of the fractional factorial designs is for
screening, that is, for identifying the most important variables
influencing a response. They have sometimes been called
"equal opportunity" designs since they provide individual
estimates of the main effects (and two-factor interactions if
required) of all the competing variables with equal, and
maximum, precision.

In any design which utilizes fewer trials than the full
factorial there will be some confounding of effects. A main effect,
for example, will be confounded with one or more high order
interaction effects; that is, the statistic which measures a
main effect will be identical to the statistic which measures
certain of the interaction effects. Thus, the statistic in
question may tell us that some effect is present, but it cannot
tell us whether the main effect, the interaction effect, or
some additive combination of the effects is present. Only if
the interaction effect can be assumed to be zero (or at least
negligibly small) are we justified in stating that the observed
effect does in fact estimate the main effect.
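A small numerical sketch may make the meaning of confounding concrete. Here a fourth factor is assigned to the sign column X1*X2 of an eight-run design; this generator, the helper `contrast`, and the made-up responses are assumptions chosen purely for illustration. The responses contain no X4 effect at all, only an X1-by-X2 interaction, yet the X4 contrast reports an effect.

```python
from itertools import product

# Eight runs: a full factorial in X1, X2, X3 with X4 generated as X1*X2
# (an illustrative generator, assumed here for the sketch).
runs = [(x1, x2, x3, x1 * x2) for x3, x2, x1 in product((-1, 1), repeat=3)]

def contrast(signs, responses):
    """Effect estimate: mean response at + minus mean response at -."""
    half = len(signs) / 2
    return sum(s * y for s, y in zip(signs, responses)) / half

# Hypothetical responses: no X4 main effect, only an X1*X2 interaction.
responses = [10 + 3 * x1 * x2 for x1, x2, x3, x4 in runs]

x4_column = [r[3] for r in runs]
x1x2_column = [r[0] * r[1] for r in runs]
effect_x4 = contrast(x4_column, responses)      # statistic for the X4 main effect
effect_x1x2 = contrast(x1x2_column, responses)  # statistic for the interaction
```

The two sign columns are identical, so the two statistics are identical as well; the data alone cannot say whether the observed value of 6 belongs to X4, to the X1X2 interaction, or to a sum of the two.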

Of course, every design provides confounded (biased) esti- 
mates. For example, quadratic and cubic effects, if present, 
confound the estimates of the mean and main effects respectively 
whenever a two level factorial design is employed. Trends, and 
Poisson effects confound estimates. Any phenomenon omitted from 
a fitted model will confound certain estimated parameters in 
the model regardless of the design used. Good fractional factor- 
ial designs are carefully arranged so that estimates of the ef- 
fects thought to be important are confounded by effects thought 
to be unimportant. 


Since experimenters are usually most interested in main effects,
it is essential that main effects not be confounded
with other main effects. In practically all of the commonly 
used fractional factorial designs main effects are confounded 
with high order interactions. Thus if an experimenter uses 
one of these designs to measure main effects, he must be will- 
ing to assume, at least tentatively, that the interactions with 
which the main effects are confounded are zero or quite small. 
Few experimenters are deterred from the use of fractional fac- 
torial designs by the necessity of such assumptions about high 
order effects. 

Much has been written on confounding in fractional factorial
designs both in books and in articles in the professional
journals. Tables of some of the available designs can be found
in the book by Cochran and Cox [17] and in a publication in the
Applied Mathematics Series of the National Bureau of Standards
[20].

Table 1a contains a fractional factorial design for our
two-level, seven-factor inventory simulation experiment which
contains only 8 design points rather than the 128 required in a
full factorial design. This design is called a $2^{7-4}$ fractional
factorial design (the one-sixteenth replication of a $2^7$
factorial design of resolution III). The fraction given in Table 1a
is the smallest that can be chosen so that estimates of the
main effects of the variables are mutually orthogonal.

Table 1a                                      Table 1b
The $2^{7-4}$ Fractional Factorial Design     The Complementary Fraction

X1  X2  X3  X4  X5  X6  X7                    X1  X2  X3  X4  X5  X6  X7
 -   -   -   +   +   +   -                     +   +   +   -   -   -   +
 +   -   -   -   -   +   +                     -   +   +   +   +   -   -
 -   +   -   -   +   -   +                     +   -   +   +   -   +   -
 +   +   -   +   -   -   -                     -   -   +   -   +   +   +
 -   -   +   +   -   -   +                     +   +   -   -   +   +   -
 +   -   +   -   +   -   -                     -   +   -   +   -   +   +
 -   +   +   -   -   +   -                     +   -   -   +   +   -   +
 +   +   +   +   +   +   +                     -   -   -   -   -   -   -


If more information is needed, an additional eight-run $2^{7-4}$
fractional factorial can be employed and the data from all runs
analyzed. By combining fractions, the opportunity exists to uncover
information on various interactions between the factors.
Additional fractions can be added, leading ultimately to the complete
$2^7$ design. However, the salient roles of the seven variables
would usually be uncovered long before the full $2^7 = 128$
experiments were completed. One valuable second fraction is the
complementary, or fold-over, fraction, in which each run is the
opposite of an earlier run as illustrated in Table 1b. Further
advantages occur if the entire sixteen runs are planned
initially and run in eight blocks of complementary pairs, such as

     -   -   -   +   +   +   -
     +   +   +   -   -   -   +

Many other blocking arrangements are possible.
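The eight-run fraction, its fold-over, and the pairing into blocks can be generated mechanically. The sketch below assumes the generators X4 = X1X2, X5 = X1X3, X6 = X2X3, and X7 = X1X2X3, which reproduce the complementary pair shown above; the function names are hypothetical and the choice of generators should be treated as an assumption rather than as the authors' stated construction.

```python
from itertools import product

def fractional_2_7_4():
    """Build an 8-run, 7-factor two-level design from a full factorial in X1..X3."""
    design = []
    for x3, x2, x1 in product((-1, 1), repeat=3):   # X1 varies fastest
        # Assumed generators: X4=X1X2, X5=X1X3, X6=X2X3, X7=X1X2X3.
        design.append((x1, x2, x3, x1 * x2, x1 * x3, x2 * x3, x1 * x2 * x3))
    return design

def fold_over(design):
    """Complementary fraction: every sign of every run is reversed."""
    return [tuple(-x for x in run) for run in design]

def columns_orthogonal(design):
    """True if every pair of factor columns has zero inner product."""
    k = len(design[0])
    return all(sum(run[i] * run[j] for run in design) == 0
               for i in range(k) for j in range(i + 1, k))

first = fractional_2_7_4()
second = fold_over(first)
blocks = list(zip(first, second))   # eight blocks of complementary pairs
```

Checking `columns_orthogonal(first)` confirms that the seven main-effect columns are mutually orthogonal in only eight runs, and the combined sixteen runs keep that property.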

If "interactions" between the factors upon the response are 
anticipated, then estimates of these effects in addition to 
the main effects can be obtained by using a fractional factorial
design of higher "resolution." The cost to the experimenter
is more computer runs. A discussion of the resolution of
fractional factorial designs, along with methods for constructing
and analyzing the designs, is given in [14].

If for some reason we find it necessary to investigate more
than two levels of certain factors, then the $2^{k-p}$ fractional
factorial designs can be quickly adapted to provide mixed level
fractional factorials. To illustrate, consider the column
vectors $X_6$ and $X_7$ in the design of Table 1a. Considered together,
they provide four patterns of signs (--), (+-), (-+), and (++).
Let each pattern be associated with one of the four levels
(-3,-1,1,3) of a new variable $X_6$. The resulting mixed level
$2^5 \times 4$ fractional factorial is displayed in Table 2a. If
each pattern is associated with one of the three levels (-1,0,1)
of a new variable $X_6$ we obtain the $2^5 \times 3$ fractional
displayed in Table 2b.

These are merely two examples of the very large number of
mixed level factorial designs of very desirable estimation
qualities discussed in [1]. An important group of three-level
fractional factorial designs is discussed in [10].


Table 2a                              Table 2b
Fractional $4 \times 2^5$             Fractional $3 \times 2^5$

X1  X2  X3  X4  X5  X6                X1  X2  X3  X4  X5  X6
 -   -   -   +   +  -1                 -   -   -   +   +   0
 +   -   -   -   -   3                 +   -   -   -   -   1
 -   +   -   -   +   1                 -   +   -   -   +   0
 +   +   -   +   -  -3                 +   +   -   +   -  -1
 -   -   +   +   -   1                 -   -   +   +   -   0
 +   -   +   -   +  -3                 +   -   +   -   +  -1
 -   +   +   -   -  -1                 -   +   +   -   -   0
 +   +   +   +   +   3                 +   +   +   +   +   1
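The conversion of two sign columns into a single multi-level column is a simple lookup. In this sketch the two sign columns are assumed to follow the generators X6 = X2X3 and X7 = X1X2X3 in standard run order, the four-level mapping follows the order of patterns and levels given in the text, and the three-level mapping (both mixed patterns sharing the middle level) is an assumption about how four patterns are collapsed to three levels.

```python
# Sign columns X6 and X7 of the eight-run design, in run order
# (assumed to follow X6 = X2*X3 and X7 = X1*X2*X3).
x6 = [+1, +1, -1, -1, -1, -1, +1, +1]
x7 = [-1, +1, +1, -1, +1, -1, -1, +1]

# Each of the four sign patterns (X6, X7) maps to one level of the new variable.
FOUR_LEVEL = {(-1, -1): -3, (+1, -1): -1, (-1, +1): 1, (+1, +1): 3}
# For three levels the two mixed patterns are assumed to share the middle level.
THREE_LEVEL = {(-1, -1): -1, (+1, -1): 0, (-1, +1): 0, (+1, +1): 1}

four_level_column = [FOUR_LEVEL[p] for p in zip(x6, x7)]
three_level_column = [THREE_LEVEL[p] for p in zip(x6, x7)]
```

Each level of the four-level variable appears twice in the eight runs, while in the three-level version the middle level appears four times and each extreme level twice.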


Rotatable Designs 


If our mode of data analysis with our example model consists
merely of fitting a first order regression equation to the data
generated by the simulation experiment, then a two-level full
factorial design or a fractional factorial design will provide
sufficient precision to estimate the coefficients of the
regression equation. (Recall that we have ruled out the full
factorial design on the basis of computation time requirements.)
However, if we fit a second-order polynomial (or higher order
polynomial) to our output data, then a fractional factorial
design may lead to parameter estimates of the coefficients of
the squared terms which have relatively low precision. Since
a second-order polynomial in seven variables has 36
coefficients, the $2^{7-4}$ fractional factorial design which we
previously described would hardly be adequate. However, we may be
able to find a rotatable design which requires fewer than the 128
design points required for a $2^7$ factorial design but more
design points than the one-sixteenth replicate of a $2^7$ factorial
design of resolution III.
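The count of 36 coefficients can be verified by tallying the terms of a full second-order polynomial in k variables: one constant, k linear terms, k squared terms, and one cross-product term for each pair of variables.

```latex
1 + k + k + \binom{k}{2} = \binom{k+2}{2},
\qquad k = 7:\quad 1 + 7 + 7 + 21 = 36
```

Since an eight-run design can estimate at most eight coefficients, it cannot support a model with 36.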


Rotatable designs were developed specifically for fitting
second (and higher) order polynomials to output data. These
designs exist for all values of k (the number of factors) and
can be constructed by combining together the vertices of the
regular or semi-regular geometric figures plus center points.
Rotatable designs guarantee that the standard deviation of the
fitted response at any point in the factor space depends only
on the distance of the point from the center and not on its
direction. The design points can therefore be rotated about
the center without changing the variance of the predicted
response at any point on the fitted response surface. Rotatable
designs require quantitative factors whereas factorial designs
need not.

A fairly simple design is the cube plus star plus center
points. If we take the origin as the center of the design,
the cube portion will be a two level full factorial with each
factor at the levels $-a$ and $+a$. Writing k for the number of
factors, we can designate the $2^k$ design points in the cube by
$(\pm a, \pm a, \ldots, \pm a)$ where it is understood that each
combination of plus signs and minus signs yields one of the
points. The star portion consists of the 2k points
$(\pm b, 0, \ldots, 0)$, $(0, \pm b, \ldots, 0)$, ...,
$(0, 0, \ldots, \pm b)$. The center points consist of $n_0$ runs at
the center of the design, i.e., the point $(0, 0, \ldots, 0)$. The
condition for rotatability is that $b/a = 2^{k/4}$. In our example
experiment we might use a fractional factorial in place of the
full $2^7$ factorial for the cube portion of the design. Draper
and Herzberg [19] have described a second order rotatable
design for seven factors which requires 108 design points which
may be suitable for our inventory model experiment.
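The cube plus star plus center construction is easy to generate for any k. The following is a minimal sketch using a full factorial for the cube portion; the function name `central_composite` and its parameter names are hypothetical, and the star distance is set from the rotatability condition b/a = 2^(k/4) stated above.

```python
from itertools import product

def central_composite(k, a=1.0, n_center=1):
    """Cube plus star plus center points, with b chosen so that b/a = 2**(k/4)."""
    b = a * 2 ** (k / 4)
    # Cube portion: the 2**k points (+-a, +-a, ..., +-a).
    cube = [tuple(a * s for s in signs) for signs in product((-1, 1), repeat=k)]
    # Star portion: the 2k points with +-b on one axis and zero elsewhere.
    star = []
    for axis in range(k):
        for sign in (-1, 1):
            point = [0.0] * k
            point[axis] = sign * b
            star.append(tuple(point))
    # Center portion: n_center repeated runs at the origin.
    center = [tuple([0.0] * k)] * n_center
    return cube + star + center

design = central_composite(k=2, n_center=5)
```

For k = 2 and a = 1 the star distance is the square root of 2, so the four cube points and four star points all lie at the same distance from the center and form an octagon. For k = 7 with a full factorial cube the same function returns 128 + 14 + 1 = 143 points, which is why a fractional cube or a specially constructed design is attractive.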

Detailed information on rotatable designs can be found in 
chapter 8A of Cochran and Cox [17] and in numerous journal 
atelCHe Sms Or or Olle Zr Sol Shel Sie eAUS tam dopieartsc aS 0m nas 
utilized rotatable designs in his simulation experiments with 
marketing models. 


Response Surface Designs 


Suppose that in our example inventory model the parameters
C1, C2, C3, E(D), and E(LT) are fixed and known and that we
want to explore the relationship between our two decision
variables EOQ and ROP and the response variable TC. We must find
a response function which is a suitable approximation to the
true response surface

(7)     TC = TC(EOQ, ROP)


The selection of approximating functional relationships has 
more often than not been from those which are linear in param- 
eters by virtue of the convenience of linear regression tech- 
niques. With this constraint the choice lies essentially be- 
tween polynomials of higher and higher orders. For each order 
of the chosen approximation functions different types of ex- 
perimental designs can be proposed. 

Returning to our inventory model and using the notation 
which was introduced at the beginning of the paper, we might 
initially try to fit a first-order polynomial to the data gen- 
erated by our simulation experiment as a first approximation 


of the true response function, 


(8) $Y = \beta_0 + \beta_6 X_6 + \beta_7 X_7$.


A good design is the full $2^2$ factorial, repeated as illustrated
by the filled in dots in Figure 1a. If the first-order model
proves to be inadequate to represent the response over the
ranges of $X_6$ and $X_7$ under study, then the design can be aug-
mented by additional trials in $X_6$ and $X_7$, as illustrated by
the open dots in Figure 1a, and an empirical second-order poly-
nomial fitted to the response,


(9) $Y = \beta_0 + \beta_6 X_6 + \beta_7 X_7 + \beta_{66} X_6^2 + \beta_{77} X_7^2 + \beta_{67} X_6 X_7$.


Data from a series of runs, the fitted second-order model, and


the resulting contours of the empirical response surface are 

shown in Figures 1a and 1b. The argument is not that the map of
the system response provided by the design information exactly
reproduces the actual response that could be determined at any
point in the region of interest of $X_6$ and $X_7$. However, since
it is usually true that the system response changes smoothly
as variables such as $X_6$ and $X_7$ are varied, the picture is a
representation of the response function and is, in fact, worth
the proverbial thousand words. And obviously, the amount of
experimentation (the number of times the full simulation model
had to be exercised) to provide this map is close to minimal.

[Figure 1a. Octagon Design]

[Figure 1b. Fitted Second Order Model]
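
To make the fitting step concrete, the following is a minimal sketch, not the authors' program, of fitting the second-order polynomial (9) by least squares to runs at the nine points of an octagon design in coded units. The function simulated_tc and its coefficients are hypothetical stand-ins for a run of the inventory simulation, and the small linear solver is included only to keep the sketch self-contained.

```python
import math


def solve(A, b):
    """Solve the square linear system A x = b by Gauss-Jordan elimination."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit_second_order(points, y):
    """Least-squares estimates of (b0, b6, b7, b66, b77, b67)."""
    X = [[1.0, x6, x7, x6 * x6, x7 * x7, x6 * x7] for x6, x7 in points]
    p = len(X[0])
    # Normal equations (X'X) b = X'y.
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    return solve(XtX, Xty)


# Octagon design: 2^2 factorial, four star points, and one center point.
r = math.sqrt(2.0)
octagon = [(-1, -1), (1, -1), (-1, 1), (1, 1),
           (r, 0), (-r, 0), (0, r), (0, -r), (0, 0)]


def simulated_tc(x6, x7):
    """Hypothetical response standing in for one simulation run."""
    return 50.0 - 2.0 * x6 + 1.0 * x7 + 3.0 * x6**2 + 4.0 * x7**2 + 1.5 * x6 * x7


y = [simulated_tc(x6, x7) for x6, x7 in octagon]
coef = fit_second_order(octagon, y)
print([round(c, 6) for c in coef])
```

Because the stand-in response is itself an exact quadratic without noise, the fit should return the assumed coefficients; with a stochastic simulation the same nine runs would give estimates subject to sampling error.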

What has been illustrated here is called response surface 
methodology (RSM) by experimental statisticians. First- and 
second-order models and associated designs are available for 
k variables. Excellent discussions, with examples, of response 
surface methods can be found in [7] and [16]. Further infor- 
mation on the experimental designs is given in [12,13, and 18]. 
The use of multi-dimensional maps of each of several response 
functions can often uncover, to the eye, relationships between 
responses that would defy all but the most exhaustive detailed
investigations. Further, through the use of canonical analysis 
of fitted second-order models [7] the redundancy of variables 
often becomes apparent as, for example, when some single can- 
onical variable can replace two or more controlled variables. 
Furthermore, with canonical analysis we can ascertain the gen- 
eral shape of the response surface. Another important aspect 
of many response surface designs is that they may be blocked 
into subsets of runs, and performed sequentially, the addi- 
tional blocks being employed only when required. 

If we had permitted all seven factors in our response func- 
tion to vary rather than holding five of them constant, we might 
have found it necessary to resort to a fractional factorial or 
rotatable design instead of a full factorial design to fit the 


second order polynomial. 


Other Designs 


It should come now as no surprise to the reader that a great 
many other experimental designs exist for exploratory experi-
ments. Latin square, split plot, hierarchical, chain block,
Youden squares, switch-back, simplex, and balanced block de-
signs all offer the opportunity for systematic and economical
study of a response function. The literature associated with
these designs is very extensive [4,17,18,22].

Once we have discovered which of the variables are most 
important, and have described how these variables influence 
the response by means of our approximating empirical maps, we 
can next turn to answer the question why things appear as they 
do. Here we are likely to postulate nonpolynomial, nonlinear 
models, and perhaps different competing models illustrative of 
competing theories. The construction of experimental designs 
for the problems of nonlinear models, and model discrimination, 
is undergoing rapid development. Interesting references are 
[15] and [24].


Optimization Experiments 


If, on the other hand, our experimental objective were to 
find those levels of EOQ and ROP which minimize TC, then we 
should investigate the use of optimum seeking methods. 

The method of steepest descent (or ascent in the case of
maximization) is one of several major optimum seeking methods.
We require a design point in the factor space as a starting 
value for the procedure. As a first step we need a linear 
approximation, i.e., an approximating hyperplane, to the re- 
sponse surface in the neighborhood of the starting value. Un- 
less the response surface is a known function we must explore 
the neighborhood of the starting value in order to fit an ap- 
proximating hyperplane. A simplex, a fractional factorial, or 
a first-order rotatable design might be used to obtain a linear 
fit by least squares. 

The next step is to explore along the direction of steepest 
descent as determined by the approximating hyperplane. Design 
points would be chosen along this direction until a point is 
reached at which no further progress seems likely. At this 
point a new local exploration is performed with the possible 
result that a new direction of steepest descent will be ob- 


tained. 







After several cycles of this process we are apt to reach a 
point at which the local linear fit is a nearly horizontal 
hyperplane with no direction of steepest descent. This will 
be the case if we have reached the bottom of a valley or hill 
or where the fitted hyperplane becomes inadequate to represent 
the response. It can also happen if we have reached a saddle 
point which is a minimum in some directions but a maximum in 
others. As a final step, therefore, we would explore the neigh- 
borhood of the apparent minimum more thoroughly by fitting a 
local quadratic approximation to the response surface. This 
could be done using a second-order rotatable design or perhaps 
a three-level fractional factorial design. If the approximat- 
ing quadratic is positive definite, then we may have reached 
the minimum of the response surface. 
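
The cycle of local exploration followed by a move along the fitted direction can be sketched as follows. This is a simplified illustration under stated assumptions: the local design is a two-level full factorial, the response tc is a hypothetical noise-free function, and the names local_slopes and steepest_descent, the step-halving rule, and all default values are choices made for this sketch rather than part of the procedure described above.

```python
from itertools import product


def local_slopes(f, center, delta):
    """Slopes of the least-squares hyperplane fitted to a 2^k factorial.

    For an orthogonal +/-1 design the slope of factor i is
    sum(sign_i * y) / (N * delta).
    """
    k = len(center)
    runs = list(product((-1, 1), repeat=k))
    ys = [f([c + s * delta for c, s in zip(center, signs)]) for signs in runs]
    return [sum(signs[i] * y for signs, y in zip(runs, ys)) / (len(runs) * delta)
            for i in range(k)]


def steepest_descent(f, start, delta=0.5, step=0.5,
                     max_cycles=50, max_steps=100, tol=1e-3):
    """Alternate local exploration with runs along the descent direction."""
    x = list(start)
    for _ in range(max_cycles):
        g = local_slopes(f, x, delta)
        norm = sum(gi * gi for gi in g) ** 0.5
        if norm < tol:                 # fitted hyperplane is nearly horizontal
            break
        direction = [-gi / norm for gi in g]
        best = f(x)
        improved = False
        for _ in range(max_steps):     # design points along the direction
            trial = [xi + step * di for xi, di in zip(x, direction)]
            y_trial = f(trial)
            if y_trial >= best:        # no further progress along this line
                break
            x, best, improved = trial, y_trial, True
        if not improved:
            step /= 2.0                # the step overshot, so shorten it
    return x


def tc(x):
    """Hypothetical response with its minimum of 10 at (3, -1)."""
    return (x[0] - 3.0) ** 2 + 2.0 * (x[1] + 1.0) ** 2 + 10.0


x_min = steepest_descent(tc, [0.0, 0.0])
print(x_min, tc(x_min))
```

Each call to tc stands for one full run of the simulation model, so the number of calls is the quantity a practitioner would want to keep small.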

One practical question that arises with the method of steep- 
est descent that may be of some concern to management scientists 
is the step-size to take as we move in the direction of steep- 
est descent. The trade-offs involve computer time and the risk 
of overshooting the mark and missing the minimum point on the 
response surface. If we choose a relatively small step-size, 
enormous amounts of computer time may be required to converge 
on an optimum. On the other hand, a larger step-size may re- 
duce computation costs but increase our chances of overshooting 
the optimum. 

The single-factor method is another important optimum seek-
ing method. In this method the level of a single factor is
varied while the levels of all other factors are held constant. 
When no further improvement is possible, the single factor is 
held fixed at its best level and another factor is chosen to 
be varied. After this single-factor search is performed for 
each factor in the experiment, the cycle begins again by vary- 
ing the original factor once more. The process ends when no 
change in the level of any single factor leads to a decrease in 
the level of the response. 
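
A minimal sketch of the single-factor search is given below. The fixed step size, the function names, and the two test responses (bowl and ridge) are hypothetical and serve only to show the mechanics, including the way the search can stop on a ridge that is not parallel to an axis.

```python
def single_factor_search(f, start, step=0.25, max_cycles=100):
    """Vary one factor at a time, holding the others at their current levels."""
    x = list(start)
    best = f(x)
    for _ in range(max_cycles):
        changed = False
        for i in range(len(x)):
            for sign in (1, -1):
                while True:
                    trial = x[:]
                    trial[i] += sign * step
                    y = f(trial)
                    if y >= best:      # no improvement in this direction
                        break
                    x, best, changed = trial, y, True
        if not changed:                # no single-factor change helps
            break
    return x, best


def bowl(x):
    """Hypothetical response with axes-aligned contours, minimum at (1, -2)."""
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


def ridge(x):
    """Hypothetical narrow valley along the line x0 = x1, minimum at (0, 0)."""
    return 10.0 * (x[0] - x[1]) ** 2 + 0.01 * (x[0] + x[1]) ** 2


bowl_x, bowl_y = single_factor_search(bowl, [0.0, 0.0])
ridge_x, ridge_y = single_factor_search(ridge, [2.0, 2.0])
print(bowl_x, bowl_y)
print(ridge_x, ridge_y)
```

On the bowl the search reaches the minimum, while on the ridge it does not leave the starting point because every move parallel to an axis climbs the steep side of the valley.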

The single-factor method has at least two advantages when 




compared to the method of steepest descent. One is that it does 
not require local exploration at intermediate stages of the pro- 
cess. Since exploratory designs require many points, this could 
be a substantial saving. A second advantage for the single-fac- 
tor method occurs when it is markedly easier to vary one factor 
at a time than it is to vary several factors simultaneously. 
This might be the case for example in an experiment using some 
engineering device (an analog computer, perhaps) where factor 
levels correspond to settings on a dial. 

The single-factor method also has an important disadvantage 
when compared with the method of steepest descent. The search 
can move only in directions which are parallel to the coordin-
ate axes. As a result it is possible for the search to terminate
on a ridge not parallel to any axis. This difficulty can be 
overcome by exploring the neighborhood of the apparent minimum 
by response surface methods. 

Other optimum seeking methods exist, some of which combine 
the two approaches given above with methods for changing both 
the size of the design and the single factor steps. 


Summary 


Successful learning requires: (1) the full use of prior 
knowledge in proposing useful models, and (2) good experimental 
strategies for gathering evidence useful for synthesis and con- 
jecture. Our present experimental designs, and their methods 
of analysis, meet these requirements well. The mathematical 
models associated with the designs are very rich in the number 
of alternatives which are possible. The techniques for expos- 
ing the information in the data provided by the designs are 
well organized and usually easy to perform. Experimental de-
signs and their analyses thus directly enhance the remarkable
acts of synthesis of new knowledge, and conjecture of new ideas, 
that every experimenter covets. 


To review, a simulation model is constructed in the hope that 







it will successfully mimic a real world system. The model may, 
as a consequence, become very complicated and involved. How- 
ever, our understanding will proceed only when we are able to 
synthesize the system in terms of simpler explanations. This 
synthesis will require initially the identification of the major 
variables affecting the responses thus bringing us to the pro- 
blem of screening experimentation. Once the important variables 
are identified, we must next evolve terse empirical relation- 
ships associating the leading variables with the responses. In 
essence, simple empirical models are super-imposed on the larger 
detailed simulation model. The simpler models are useful for 
drawing general maps, for making broad inferences, and for ident- 
ifying areas where more detailed models are required. Once an 
empirical understanding has been acquired, it becomes possible 
to postulate general laws and theories that may be applicable 
not only for the particular system problem under study, but for 
other similar systems. Of course, all that has been described 
is part and parcel of what is usually called "the scientific
method." The use of experimental designs serves to make this
learning process as economical in time and in resources as pos-
sible.


Bibliography


1. Addelman, S. "Orthogonal Main-Effect Plans for Asymmetrical
Factorial Experiments," Technometrics, IV (1962).

2. Bonini, Charles P. Simulation of Information and Decision
Systems in the Firm. Englewood Cliffs, N.J.: Prentice-Hall,
Inc., 1963.

3. Bose, R.C., and Carter, R.L. "Complex Representation in
the Construction of Rotatable Designs," Annals of Mathe-
matical Statistics, XXX (1959).

4. Bose, R.C., Clatworthy, W.H., and Shrikhande, S.S. Tables
of Partially Balanced Designs with Two Associate Classes.
North Carolina Agricultural Experiment Station Technical
Bulletin, Number 107 (1954).

5. Bose, R.C., and Draper, N.R. "Second Order Rotatable De-
signs in Three Dimensions," Annals of Mathematical Stat-
istics, XXX (1959).

6. Box, G.E.P. "Multi-factor Designs of First Order," Bio-
metrika, XXXIX (1952), 49-57.

7. Box, G.E.P. "The Exploration and Exploitation of Response
Surfaces: Some General Considerations and Examples,"
Biometrics, X (1954), 16-60.

8. Box, G.E.P. "Use of Statistical Methods in the Elucida-
tion of Basic Mechanisms," Biometrics, XIII (1957).

9. Box, G.E.P., and Behnken, D.W. "Simplex-sum Designs: A
Class of Second Order Rotatable Designs Derivable from
Those of First Order," Annals of Mathematical Statistics,
XXXI (1960).

10. Box, G.E.P., and Behnken, D.W. "Some New Three Level De-
signs for the Study of Quantitative Variables," Techno-
metrics, II (1960), 455-475.

11. Box, G.E.P., and Draper, N.R. "A Basis for the Selection
of a Response Surface Design," Journal of the American
Statistical Association, LIV (1959), 622-654.

12. Box, G.E.P., and Draper, N.R. "The Choice of Second Order
Rotatable Designs," Biometrika, L (1963), 335-352.

13. Box, G.E.P., and Hunter, J.S. "Multi-factor Experimental
Designs for Exploring Response Surfaces," Annals of
Mathematical Statistics, XXVIII (1957), 195-241.

14. Box, G.E.P., and Hunter, J.S. "The 2^(k-p) Fractional Factor-
ial Designs, Part I and II," Technometrics, III (1961).

15. Box, G.E.P., and Hunter, William G. "The Experimental
Study of Physical Mechanisms," Technometrics, VII (1965),
23-42.

16. Box, G.E.P., and Wilson, K.B. "On the Experimental Attain-
ment of Optimum Conditions," Journal of the Royal Sta-
tistical Society B, XIII (1951), 1-45.

17. Cochran, W.G., and Cox, G.M. Experimental Designs. New
York: John Wiley & Sons, 1957.

18. Davies, O.L., ed. Design and Analysis of Industrial Exper-
iments. New York: Hafner Publishing Co., 1960.

19. Draper, Norman R., and Herzberg, Agnes M. "Further Second
Order Rotatable Designs," Annals of Mathematical Sta-
tistics, XXXIX (Dec., 1968), 1995-2001.

20. "Fractional Factorial Designs for Factors at Two and Three
Levels," U.S. Department of Commerce, National Bureau of
Standards, Applied Mathematics Series 58, U.S. Government
Printing Office, Washington, D.C. (Sept. 1, 1961).

21. Hammersley, J.M., and Handscomb, D.C. Monte Carlo Methods.
New York: John Wiley & Sons, 1964.

22. Herzberg, A.M., and Cox, D.R. "Recent Work on the Design
of Experiments: A Bibliography and a Review," Journal of
the Royal Statistical Society A, CXXXII (1969).

23. Hill, William J., and Hunter, William G. "A Review of
Response Surface Methodology: A Literature Survey,"
Technometrics, VIII (Nov., 1966), 571-590.

24. Hill, W.J., Hunter, W.G., and Wichern, D.W. "A Joint De-
sign Criterion for the Dual Problem of Model Discrimina-
tion and Parameter Estimation," Technometrics, X (1968).

25. Hufschmidt, M.M. "Analysis of Simulation: Examination of
Response Surface," Design of Water-Resource Systems,
Arthur Maass, et al. (eds.) Cambridge: Harvard Univer-
sity Press, 1966.

26. Naylor, Thomas H., Balintfy, Joseph L., Burdick, Donald S.,
and Chu, Kong. Computer Simulation Techniques. New York:
John Wiley & Sons, 1966.

27. Naylor, Thomas H., and Burdick, Donald S. "Response Sur-
face Methods in Economics," Review of the International
Statistical Institute, XXXVII (1969).

28. Naylor, Thomas H., Burdick, Donald S., and Sasser, W. Earl.
"Computer Simulation Experiments with Economic Systems:
The Problem of Experimental Design," Journal of the
American Statistical Association, LXII (Dec., 1967),
1315-1337.

29. Plackett, R.L., and Burman, J.P. "The Design of Optimum
Multifactorial Experiments," Biometrika, XXXIII (1946),
305-325.

30. Preston, Lee E., and Collins, Norman R. Studies in a
Simulated Market. Research Program in Marketing, Graduate
School of Business Administration, University of Cali-
fornia, Berkeley, 1966.

31. Spendley, W., Hext, G.R., and Himsworth, F.R. "Sequential
Application of Simplex Designs in Optimization and Evolu-
tionary Operation," Technometrics, IV (1962).


Factor Selection 


John L. Overholt, Center for Naval Analyses 


Introduction 


The proper selection of factors is often difficult in ex- 
periments, and seems to be more difficult in computer simula- 
tions than in the laboratory or plant. The number of possible 
factors may be very large, larger than in the laboratory, and 
simulation usually is not used unless the situation is com- 
plex. The difficulties will be discussed in more detail la- 
ter. The problem is to select those factors that are most 
important, so that a critical set of runs can be made in a 
reasonable time. In a budget submission, for example, it is 
impossible to examine all ramifications but one must prepare 
the best answer possible by a fixed date and refine the an- 
swer in subsequent cycles. The problem is compounded when 
many groups are interested in the outcome but each has its
own facet of responsibility. Lastly, computer simulation ex- 
periments indicate no residual error due to significant omit- 
ted factors. Thus, there is no clue from a large residual 
error to alert one to the presence of omitted but important 
factors. Instead there may be binomial errors from Monte Car- 
lo simulations which may or may not be typical of the "real 
world." Naylor, Burdick and Sasser [8] state that very little 
attention has been paid in print to experimental design tech- 


niques relevant to computer simulations. 


Literature on Factor Selection 


Before discussing the selection of variables in the face of


the above problems, the advice given in selection of factors 







by standard texts on experimentation and experimental design 
will be discussed. E.B. Wilson [9] in discussing experiment- 
ation in general, advises one to decide in advance just what 
is being tested, recognizing that purely exploratory experi- 
ments are necessary in a new field. At the origin of the pro- 
blem and at every stage of planning one should ask, "Why am 
I doing this? Will it tell me what I want to know?" An at- 
tempt should next be made to devise the simplest crucial ex- 


periment that will answer the question or questions. 


Factors 


All science rests on the idea that similar events occur 
under similar circumstances. The identity of particular e- 
vents can often be described by fixing a rather small number 
of factors. The essential characteristics which fix the oc- 
currence of a given event are choice of values of the inde- 
pendent variables or factors or treatments or parameters, 
which are more or less synonymous terms, depending on the
statistical text used. 

For review, the terms to be used will be discussed briefly. 
Factors are input variables or the independent variables 
which can be set by the experimenter, and responses (some- 
times "yields" or Y's) are the results (dependent variables) 
obtained in the experiment/simulation and depend upon the 
"settings" or "levels" of the factors used in the experiment. 
Thus, the purpose of this discussion is the selection of the 
factors (X's) for the general matrix equation Y = XB + error, 
where the coefficients (B's) are determined from the experi-
ments or simulations. In some cases the term "factor" is
also applied to the dependent variables, the Y's. 

There are several categories of factors: 


1. continuous (quantitative)
- distribution known
- distribution unknown
or discrete (qualitative)

2. interdependent
- nested
- interactions

3. observed or unobserved

4. exogenous and endogenous

5. in addition to the controllable factors, uncontrol-
lable causes for variation are usually present.


These categories are not mutually exclusive so that some 
terms are nearly synonymous while others are poles apart.

Some distinctions are necessary in selecting appropriate de- 
signs. The important ones are continuous or discrete, and 
interdependent. Examples of continuous factors are tempera- 
tures, amount of reagents, distances, time, probabilities, 
percentages and the annual/gross national products. Some dis- 
crete factors are varieties of corn, tire brands, or organi- 
zations which are to be compared in an experiment. 

Factors may be "nested" and interrelated. For example,
nested variables may be months, weeks, days and shifts (with- 
in a day), or a tin can can be cut from a certain position 
(center, side or end) of a tin plate coil and the coils may 
have been from the same or different ingots of steel made 
from different batches. Interactions are non-additive effects 
produced by two or more factors. 
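
As a small illustration of what "non-additive" means, the sketch below computes the two main effects and the interaction effect from the four runs of a two-level, two-factor experiment. The function name effects_2x2 and both sets of responses are hypothetical; the effect definitions are the usual contrasts for a 2 x 2 factorial with coded levels -1 and +1.

```python
def effects_2x2(y):
    """Return (main effect of A, main effect of B, AB interaction).

    y maps coded levels (a, b), each -1 or +1, to the observed response.
    """
    main_a = (y[(1, -1)] + y[(1, 1)] - y[(-1, -1)] - y[(-1, 1)]) / 2
    main_b = (y[(-1, 1)] + y[(1, 1)] - y[(-1, -1)] - y[(1, -1)]) / 2
    inter = (y[(1, 1)] + y[(-1, -1)] - y[(1, -1)] - y[(-1, 1)]) / 2
    return main_a, main_b, inter


levels = [(-1, -1), (1, -1), (-1, 1), (1, 1)]

# Hypothetical additive response: the effect of A is the same at both levels of B.
additive = {(a, b): 10 + 2 * a + 3 * b for a, b in levels}

# Hypothetical non-additive response: the product term makes A's effect depend on B.
non_additive = {(a, b): 10 + 2 * a + 3 * b + 1.5 * a * b for a, b in levels}

print(effects_2x2(additive))
print(effects_2x2(non_additive))
```

In the additive case the interaction contrast is zero, and in the second case it is not, which is the situation a design must be able to detect if interactions are suspected.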

Economic papers such as Naylor [8] and Bonini [1] use the 
terms endogenous variables for the factors which are control- 
lable within a business enterprise and exogenous for those 
which are external to the firm, while others use the two terms 
to designate the independent and dependent factors. Factors 
in simulations may be held constant or varied. Other factors 
such as fires or earthquakes may be uncontrollable but can be 
observed and taken into account in the analysis. Recognition 
of the type of factor is necessary for good design and anal- 


ysis. 


Proposed Selection Methods 


Cochran and Cox [2] have proposed that a draft be written 


describing the proposed experiment giving: (1) a statement 







of the objectives; (2) a description of the experiment and 
experimental material; and (3) an outline of the method of 
analysis before the experiment is started. Kempthorne [7] 
and others have specified the formulation of a hypothesis as 
the second stage after stating the problem. 

Bonini [1] proposed four criteria for selecting factors to 
be tested in a computer simulation of a model of a business 
firm's activity: 


1. Preliminary tests to establish the reasonableness 
and stability of the model. 

2. Changes should be in the external conditions not 
under the control of the firm and the aggregation of fac- 
tors used to simulate the systems in the model. 

3. Test the most crucial parameter first--selected 
largely by intuition. 

4. Conduct experiments with factors in existing hypoth- 
eses (in economics, accounting, or the behavioral sci- 
ences). Presumably new hypotheses would also be formu- 
lated and tested. 


C. Daniel [4] proposed reviewing history very thoroughly
by: (1) constructing an influence matrix containing what you
think you know and do not know about the effect of factors on
responses that you can measure; (2) analyzing all data avail-
able by simultaneous least squares; and (3) on the basis of
the second step, revising the influence matrix and planning the
first step on factor screening.

Daniel's "influence matrix" contains information about the
proposed factors: the range over which the factor is to be 
varied, and information on the expected responses. That is, 
"What is the response to a change in a given variable?" For 
example, if the amount of soap used is the variable, is the 
amount of suds proportional to the amount of soap needed or 
does a little bit of soap make all the suds needed so that an 
excess is superfluous or even detrimental? This extension of
the term factor to include the complexity of the expected re- 
lationship between factors and results is useful in planning 
experiments efficiently. For measuring at several levels to 
obtain non-linear relationships, "pseudo-factors" may be need- 
ed. These are letters having binary meaning which are useful 







in design preparation and analysis of results. Thus a four
level variable, $Q_1$, $Q_2$, $Q_3$, $Q_4$, can also be described by com-
binations of A and B: $\bar{A}\bar{B}$, $A\bar{B}$, $\bar{A}B$, and $AB$.
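
The idea of pseudo-factors can be shown with a short sketch in which two binary letters A and B jointly label the four levels of one variable. The level names, the 0/1 coding, and the particular assignment of combinations to levels are assumptions made for illustration; any one-to-one assignment would serve.

```python
from itertools import product

levels = ["Q1", "Q2", "Q3", "Q4"]

# Each (A, B) combination of the two-level pseudo-factors labels one level.
# The order of assignment here is arbitrary and hypothetical.
decode = dict(zip(product((0, 1), repeat=2), levels))
encode = {level: ab for ab, level in decode.items()}

for level in levels:
    a, b = encode[level]
    print(level, "-> A =", a, ", B =", b)
```

Coded this way, a four-level variable can be placed in a two-level factorial layout as if it were two separate factors, with the understanding that the A, B, and AB contrasts together describe differences among the four levels.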

In the influence matrix there is no provision for record-
ing interactions. Therefore, a useful extension of the in-
fluence matrix when considering possible responses with many
factors, is a two-way figure reviewing existing information 
on possible interactions. A computer simulation with many 
variables might be expected to have many interactions but 
this may or may not be true. It was surprising to me to find 
that many factors were independent (non-interacting) in ex- 
ample (4) (to be discussed), upon considering what we knew 
about each individual pair. For example, the detection of a 
submarine is independent of the destruction of an aircraft by 
a missile fired from another aircraft. The likely existence
of the interactions can be indicated as in Figure 1, where 
the main effects are represented by the shaded squares. A 


separate figure is required for each response. 


[Figure 1. Two-way chart of Factor against Factor, with the Main Effects squares shaded]





D.R. Cox [3] has suggested five classes of factors: 
1. those of direct interest 


2. those that modify the action of main factors or 
may throw light on how the main factors work 


3. those connected with experimental technique 


4. potentially important groups because of physically 
important differences (age and sex difference of 
patients) 


5. deliberately inserted variations of experimental 
units, designed to examine interactions and extend 
the range of validity of the conclusion concerning 
the main factors. 


Other Factor Selection Criteria 


Several other subjects which are not discussed in most ex- 
perimental design texts, are worth considering in computer 


simulations. 


What are the Objectives of the Sponsor? 


Because computer simulations can treat most or all of an 
organization, many functional groups in the organization will 
be interested in the results. The needs of each branch of 
the organization may be different, although some overlap. 
Each branch may be interested in a different set of variables, 
so the first task is to discover these interests usually by 
careful attention in conference. In a military organization, 
the highest levels of command are concerned with future pol- 
icies and the best means of implementing them. A sub-group 
is responsible for choosing new equipment needed to execute 
policy decisions, taking cost and effectiveness into account. 
Other groups build and test the equipment. Still others 
train men in the use and maintenance of the equipment and in- 
tegrate many units into an organization. An industrial or 
financial organization which uses computer simulations may 
have allied problems because the heads of manufacturing, 


sales, and research have different aims in the same organiza- 




tion. No complex simulation can go into the depth required 

to explore all variables at once, although it is frequently 
necessary to examine the effect of changing variables of spec- 
ial interest to other than the sponsor to obtain meaningful 
answers. Therefore, the first task in selecting variables is 
to find out "Who is boss?" and what problem(s) he wants solved 
by listening very carefully and examining whether and how the 
simulation available can get the answers desired. Often the 
problems are not explicitly stated but evolve in consulta- 
tion later when tentative work plans are presented. Usually 
more questions are raised than can be answered adequately in 
the allotted time. 


What Factors are Required? 


When the problem has been correctly stated, it becomes 
clear that certain factors are important. The recognized fac- 
tors that have a bearing on the solution are listed and rank- 
ed in importance. The factors are examined as indicated 
above for their suitability. 


Scope 


A problem should be studied in a wide enough context so 
that some assertion of generality can be made. Although seem- 
ingly repetitious of the two earlier points, suitable answers 
for the sponsor are obtained only by examining the factors in 
breadth. One must take large steps and not spend too much in- 
itial effort on examining small steps. If a factor proves to 
be important, then the finer details are examined. It is dis- 
concerting to have a critic say, "If you had used another sit- 
uation, your conclusion would have been different. Therefore, 


I don't think you have proved your case." 


Are the Variables Compatible with the Simulation? 


Some variables are translated into other terms so they can 


be inserted numerically. Economic models discussed in [1] 







and [8] are good examples. In the examples that follow,
a "best" estimate was translated into numerical figures which
in some cases were the number of units possessed by each side 
and in other cases, the probability of successful operation. 
In still other cases the average and extreme weather effects 
from records in geographical locations could be translated 
into inputs for simulation at a lower level of aggregation 
and the results used as inputs for the larger simulation. 
Considerable ingenuity is required at this stage. When an 
important variable cannot be tested, it is an indication that 
the simulation is not satisfactory and that limited answers 
will be obtained until the simulation is improved. Candor in 


reporting this situation seems wise. 


How is Uncertainty Treated? 


Uncertainty can be a factor as indicated above by using
"best," "pessimistic," and "optimistic" inputs. In some
cases where a population can be estimated, random sampling 
from the simulated population for Monte Carlo inputs can fur- 


nish expected values for the factor. 
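
Both treatments of uncertainty can be sketched briefly. In the hypothetical example below the three scenario values, the choice of a triangular population, and the names expected_input and detection_probability are all assumptions made for illustration and are not taken from the studies described here.

```python
import random

# Uncertainty as a three-level factor: one run per scenario value.
scenarios = {"pessimistic": 0.60, "best": 0.75, "optimistic": 0.90}


def detection_probability(rng):
    """Draw one input from an assumed triangular population."""
    return rng.triangular(scenarios["pessimistic"],
                          scenarios["optimistic"],
                          scenarios["best"])


def expected_input(sampler, n_draws, seed=0):
    """Estimate the expected value of an input by random sampling."""
    rng = random.Random(seed)          # fixed seed so the runs can be repeated
    draws = [sampler(rng) for _ in range(n_draws)]
    return sum(draws) / n_draws


estimate = expected_input(detection_probability, n_draws=10000)
print(round(estimate, 3))
```

The first approach costs three runs per uncertain factor and shows the spread of outcomes; the second folds the uncertainty into a single expected value at the price of many random draws.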


Hierarchy of Solutions 


In the Center for Naval Analyses we have used several sub- 
routines which were quite complicated simulations for obtain- 
ing inputs for a larger routine. Each of the subroutines was 
treated as a separate problem by exploring the response sur- 
face as a function of a number of variables. Then from fam- 
ilies of fitted curves, appropriate values could be inserted 
into the large scale routine. The alternative would have 
been the inclusion of all of the factors into a very large 
design which would have been unwieldy and required excessive 
computer time. This choice of method in handling variables 
in designs is somewhat similar to conducting series of exper- 
iments in sequence in laboratories, pilot plants, and in full- 
scale operations. Allocating factors to the right hierarchy 




is a general problem in laboratory and computer experiments. 


Tentative Designs 


It is useful to construct experimental designs for discus- 
sion purposes solely to determine the feasibility of using 
the proposed variables. Revision is normal and sometimes 
even the final designs are never used, but serve to focus at- 


tention on the important part of the problem. 


Deadlines and Other Constraints 


Deadlines, time to make runs, and the number of replica- 
tions for inherent errors all determine the number of factors 
which can be tested in a given time. A good argument can be 
made for conducting research until nearly all pertinent vari- 
ables are tested and an optimal answer is found. However, in 
the real world, where complex building programs are in pro- 
gress, the best timely answer is desired regardless of opti- 
mality; such answers are subject to review and revision. 
Cyclic experimentation and review is the only insurance--not 


assurance--against omission of important variables. 


An Example of Selecting Factors 


To bring out some of the problems of factor selection, I've 
chosen a hypothetical problem familiar to all--the selection 
of a house. A candidate list of factors to describe all 
houses, might be the location, foundation (sand, rock), size, 
framework (wood, brick, steel, stone, geodesic domes, snow), 
design, protection against external elements, interior fin- 
ish, furnishings (including utilities) and finance. This 
list might be useful for a want-ad or for an architect. Yet 
an important factor is missing, namely, the identity of the 
person who is making the decision about the house--the ulti- 
mate resident. Depending on the person, the level of the 


suitable factors will vary enormously. Consider these occu-
pants: a family of five with a working father/mother; a stu-
dent; a retired couple; an Eskimo; and an astronaut on a fu-
ture interplanetary flight. Obviously the occupation of the
working members will be related to the location because the
places of business and the home determine the travel time to
work, shopping, schools, recreation, and churches. Thus, the problem
under discussion should be restated as the selection of a 
suitable home for the resident. The next task is to deter- 
mine who is making the decision in question: a family head 
deciding to rent, buy, remodel or build, or a real estate 
developer, or a banker who being asked to finance one of the 
alternatives? For each, a somewhat different set of factors 
will be important. The family head must consider the wishes 
of his wife about potential neighbors in selecting the loca- 
tion. The developer is concerned with the economic potential 
of the city, land availability, building costs and future 
transportation. The banker must weigh this investment against 
other possibilities, the money market, and the credit rating 
of the resident. The banker will not be concerned with many 
details of construction of interest to the resident asthe 
construction is sound. This case should be sufficient to 
demonstrate that it is very important to establish and state 


the context for examining the problem. 


Other Examples of Factor Selection 


The examples that follow are taken from study groups in the
Center for Naval Analyses.


Example 1. The M-16 Rifle


Although this test was not a computer simulation, it il- 
lustrates the process of factor selection better than most 
cases. Many letters to Congress cited the failures of the 
M-16 rifle, and, after investigation, the Ichord Committee 


recommended an impartial test of the weapon. Although we
knew little about such weapons, we were asked to prepare a
test plan. Our responsibility was to the general public and 
the troops in Vietnam. We listened carefully to Army, Air 
Force and Marine Corps briefings on the history of the devel- 
opment of the rifle, changes that had been made, and field 
experiences, noting each of the alleged causes or situations 
which might lead to failures. This list was compared with 
the Ichord Committee report. The first variable was the in- 
clusion of a comparison rifle, the M-1. Two types of propel- 
lant and two types of ammunition were in use, and each had 
more than one manufacturer. Therefore, each propellant type 
and the major suppliers of each were chosen as additional 
variables. Rifle cleaning and the test environment were dis- 
cussed at length. It was proposed that four oversized pla- 
toons be used and that each man would be assigned one rifle, 
one type of ammunition, and one method of cleaning. Four 
intervals between cleaning were used. The test site had sev- 
eral ranges of widely different characteristics: swamp, land- 
ing beach, dusty upland and jungle. It was proposed that 
each of the four platoons fire on four ranges and then rotate 
to a different range. Auxiliary information was collected on 
causes of failure so that the effect of continuing firing on 
wear and failure rate could be, and was, extracted. 

Several trial experimental designs were prepared and as 
objections were raised to each, the design was discarded or 
revised as the requirements became clearer. The test plan 
was executed under the supervision of the Institute for Defense
Analyses, which made some refinements of the initial plan.
This process of dialogue between the experts in the field, 
executors of the test plan and the person preparing the de- 
sign is imperative for a good plan. In this case, as in 
many computer simulations, time was important: the planning 
for this very large field trial was done at the end of Novem- 
ber 1967, tests in Panama started just after New Year's Day 
and the report was published in February 1968. 







Example 2. Anti-Submarine Capability 


Computer simulation of submarines attacking fleet forces 
was used to determine the value of adding a new anti-submar- 
ine capability. Repeated Monte Carlo runs gave the probabil- 
ity of sinking the submarines and survival of the defending 
forces. Prior to consideration of an experimental design, 
the comparisons were made with only one variable--the number 
of anti-submarine units present. A number of runs had been 
made, finding that in one case adding more defensive units
produced poorer results. This contrary finding was due to
random fluctuations in the Monte Carlo results. More generality
in results was desired. After lengthy discussion and
preparation of several trial designs, the test design used
included these variables:


Variable                                             Levels

(a) presence or absence of the new capability           2
(b) number of attacking enemy submarines                4
(c) number of defending units                           4


The levels were chosen so that linear, quadratic or cubic 
terms could be derived. The arcsine transformation of the 
probabilities was used in the analysis to normalize the
outcomes, namely, the probability of failure due to the various
factors.
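
The design and the transformation just described can be sketched as follows. This is only an illustration under stated assumptions: the text gives the number of levels but not their values, it does not say whether every combination was run, and it does not spell out the form of the arcsine transformation. The sketch assumes a full factorial, equally spaced hypothetical levels, and the common angular form arcsin(sqrt(p)).

```python
import itertools
import math

# Hypothetical level values; the text gives only the number of levels (2, 4, 4).
new_capability = [0, 1]          # absent / present
attacking_subs = [1, 2, 3, 4]    # assumed equally spaced levels
defending_units = [1, 2, 3, 4]   # assumed equally spaced levels

# Assumed full factorial: every combination of the three factors.
design = list(itertools.product(new_capability, attacking_subs, defending_units))


def arcsine_transform(p):
    """Angular transformation arcsin(sqrt(p)) of a probability p in [0, 1]."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return math.asin(math.sqrt(p))


print(len(design))                       # number of design cells
print(round(arcsine_transform(0.5), 4))  # transformed value of p = 0.5
```

Four levels of a quantitative factor are the minimum needed to separate linear, quadratic and cubic terms, which agrees with the choice of levels reported above.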


Example 3. Mapping Anti-Submarine Capabilities


The probabilities of detection and of attack for both sides
were needed for the large simulation described in example 4,
because many engagements occurred under many conditions which
could be specified. Evaluation of each engagement would have
been too time consuming, but it seemed possible to prepare
families of curves which could be consulted for the conditions
required, inserting the appropriate values from the curves into
the larger simulation. A simulation was available for obtaining
probability of submarine detections, average range of target,
and detection time from the start of the simulation. Each
engagement required a different probability, depending on such
factors as location, weather, type of target, tactics, etc. In
this case it was possible to define the problem with four
variables, each at four levels. However, two of the variables
were discrete, so 16 sets of curves were required to express
the relationships of the two remaining variables. In this case
the investigator knew which variables were important. Analysis
of prior simulations disclosed the likely relationships and the
number of levels needed. By discussion, we determined the
boundaries within which simulations could be made. This
experimental design produced satisfactory results for the
problem at hand and pointed out future problems needing
solution. The saving in computer time was large and results
were better than individual simulations because much of the
variation due to Monte Carlo simulations was smoothed out in
the curve fitting.


Example 4. Large Scale Conflicts 


The selection of variables for conflict by the Center for
Naval Analyses has been studied over a considerable time and
from various vantage points, mainly from that of the higher
Naval commands. Some of the important factors are given in
Table 1.

These variables are often a spectrum of possibilities and 
in some cases, such as force levels and technical capabili- 
ties, are a collection of variables. Existing forces were 
regarded as one level of a variable and other potential forces 
that might be built in the future were another level. The 
future Naval force is subject to variation in the number of 
ship types, aircraft, and submarines. Other variables exist
but are excluded or not named explicitly, such as the
examination of the effect of chemical and biological warfare.
The relevance of this study to economic problems faced by the
Navy should be evident.







Table 1


Typical Scenario Factors in Maritime Conflict


                                   Number of
Element/Variable                   Levels      Symbol


Enemy force levels                    2
Enemy force capabilities              2
Enemy force disposition               2
U.S. force levels                     2
U.S. force capabilities               2
U.S. force dispositions               2
U.S. alliances                        2        G
Enemy alliances                       2        F
Year of conflict                      2        -
Warning time of outbreak              2        -
Weapons: conventional/nuclear         2        E
Intensity of sea war                  3        $J_0$, $J_1$, $J_2$
Intensity of land war                 3        $K_0$, $K_1$, $K_2$
Enemy strategies                      2        A
U.S. strategies                       2        D


When each combination of factors is assembled, it repre- 
sents a set of conditions for a conflict. To evaluate the 
conflict, a scenario is required which will plausibly pit the 
forces against each other. 

Figure 2 shows a set of 288 possible conflict scenarios 
with 2 or 3 variations in each of the scenario elements from 
Table 1. Each box on the chart is one point in the scenario 
space; i.e., one combination of variables, and one individual 
scenario for potential analysis. This paper considers how 
to invoke the principles of experimental design with intui- 
tive judgments to arrive at a more efficient sampling of 
scenarios from the large group. 
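
A short sketch of how such a scenario space can be enumerated is given here. It rests on an assumption: the text does not say how the 288 boxes arise, and the sketch supposes they are the product of two levels each for G, F and E, three intensities each for the sea and land wars, and the four enemy strategies mentioned in the discussion of the figures. The strategy labels A1 to A4 are hypothetical.

```python
import itertools

# Levels per scenario element. Symbols follow Table 1; the product below is
# an assumption about how the 288 boxes of Figure 2 arise.
levels = {
    "G": ["-G", "+G"],              # U.S. alliances
    "F": ["-F", "+F"],              # enemy alliances
    "E": ["-E", "+E"],              # conventional / nuclear weapons
    "J": ["J0", "J1", "J2"],        # intensity of sea war
    "K": ["K0", "K1", "K2"],        # intensity of land war
    "A": ["A1", "A2", "A3", "A4"],  # enemy strategies (labels hypothetical)
}

# One dict per scenario, i.e., one box on the chart.
scenarios = [dict(zip(levels, combo))
             for combo in itertools.product(*levels.values())]

# Drop scenarios with no sea campaign or no land campaign.
of_interest = [s for s in scenarios if s["J"] != "J0" and s["K"] != "K0"]

print(len(scenarios), len(of_interest))
```

Under this assumption the two counts agree with the 288 scenarios quoted above and with the 128 scenarios of interest that remain once the cases with no sea campaign or no land campaign are set aside.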

In setting up Figure 3, only extreme cases within the spectrum
were considered first because a satisfactory solution under
adverse and favorable situations should disclose the magnitude
of problems to be faced. Intermediate cases are almost
infinite, so we proposed concentrating on the extremes for
alliances: "no allies" and "all allies." A shorthand convention
is to designate the extremes as -G and +G for U.S. alone and
U.S. with allies. The enemy acting alone or with allies is
identified as F and -F. Nuclear weapons are indicated by E and
conventional weapons by -E. Three conflict intensities at sea
are none, $J_0$; low-level intensity war, $J_1$; and
high-intensity war, $J_2$. Land campaigns are designated
similarly for none, minor and major size ($K_0$, $K_1$, $K_2$).
Four enemy strategies were considered. The effect of changing
the year and the U.S. strategies were not examined in the set
in Figure 2.

In Figure 3, 128 scenarios of interest on the basis of 7
factors are enclosed in heavy lines. They exclude all scenarios
designated $J_0$ (no sea campaign) and $K_0$ (no land
campaign). More than half of these (68), which are hatched, can
be excluded from immediate consideration as impossible or
highly unlikely. For example, a high-intensity war at sea with
a major land campaign and no U.S. allies (-G, $J_2$, $K_2$)
seems implausible now. If limited war at sea were in progress
concurrently with a major land campaign ($J_1$, $K_2$),
escalation of the sea war would be expected, so this case is a
transitional phase rather than a "steady state" war and is
excluded. Nuclear weapons are not considered likely for minor
wars (E, $J_1$, $K_1$); hence they are eliminated from current
consideration.

One experimental design for studying scenario variations 
of interest in high-intensity maritime conflict is shown in 
Figure 3. The run identifications are arbitrary. Thus
experimental design concepts were used in planning and were
reflected in the actual runs made, but because of some of the
difficulties listed later, the trial design presented here was
not used.

At a later stage it was necessary to curtail the number of
cases because of the time to evaluate a case. Although we
wanted to examine the effect of distance from our own and
enemy bases to the scene of conflict, it was necessary to
change the nature of the variable. Discrete scenarios for
typical situations involving the distances desired were
prepared and treated as blocks rather than using distance as a
parametric variable. In other cases, we wished to examine the
consequences of uncertainties in estimates of future
performance of military systems which are now in the design
state for both the U.S. and potential enemies. In these cases,
"best," "optimistic," and "pessimistic" performance estimates
were used. There were so many factors involved that it was
necessary to test some factors individually and in other cases
to determine the outcomes for entire groups of equipment.


Figure 2. [Chart of conflict scenarios; not reproduced.]


Figure 3. Experimental design for five scenario factors.
[Chart not reproduced. Legible labels: Nuclear weapons (+E);
Conventional weapons (-E); U.S. plus allies (+G); U.S. less
some allies (-G); Enemy alone; Enemy plus allies (-F); Enemy
strategy; U.S. policies; Normal shipping; Convoy, no sweeping;
Convoy, ASW sweep.]


In practice, most of the changes in this program were made
one at a time for various reasons. The results were analysed
and new tests of sets of conditions were proposed, namely,
those which were best for each of the two sides, and a third
case in which the best posture for the U.S. was tested in the
face of the "best" efforts of the enemy.

The large scale simulation produced results which served 
as guidance for decisions about future equipment in the plan- 
ning and design stages--particularly in isolating critical 
uncertainties which must be resolved. Because of these un- 
certainties, the best political/military action cannot be pre- 
dicted but considerable insight into these problems was ob- 
tained. 


Difficulties in Experimental Designs 


There are two types of computer simulation experiments: (1)
relatively simple cases such as examples 2 and 3, in which the
experiment can be designed in a manner very similar to
agricultural or industrial experiments except for residual
error and boundary problems; and (2) very complex experiments,
exemplified by example 4, which have special problems.

The difficulties which stem from both the factors and the
simulation (in contrast to real life experiments) are:




Residual Error 


In a computer simulation, no residual errors exist which can
be ascribed to factors which have not been tested. Although
experimenters may
be frustrated by laboratory experiments that do not go as 
planned because of an unrecognized factor, the deviation from 
expected results is available as a clue for improving the ex- 
periment. 

In contrast, computer experiments may be in error if factors
are omitted, and only by real-world validation followed by
computer correction can the experiment be improved.


Asymmetry


Given two factors there are four cases which can be examined:
$AB$, $A\bar{B}$, $B\bar{A}$ and $\bar{A}\bar{B}$, where the bar
means "not." In many
situations, some of the 4 cases are of no interest: both 
sides in a contest will avoid some of the 4 cases as exempli- 
fied in Figure 2. Hence, one can prejudge that evaluation of 
the unwanted cases will be of no value to the client. 


Boundary Conditions 


The proper bounding of factors in the design is a recog- 
nized problem, especially in industrial experiments. However, 
the bounding problem is more severe in simulations. For
example, in simulations probabilities may go to one or zero
and render conventional analysis difficult.


Stepwise Changes 


Most conflict situations require periodic decisions in 
which the variables are changed in a direction--we hope-- 
favorable to the player making the change. Most simulations
do not have the stop-change-start capability, and this lack
can affect the type of experimental design.




Summary 


The aim in real world experimental programs is to obtain 
a correct understanding of a problem in full context and as 
soon as possible. In practice the truth is approached in a 
series of steps. The results must be presented so they can 
be understood and used by the clients (managers). The fol- 
lowing steps are suggested when choosing the factors for com- 
puter simulation: 

Investigate the kind of answer which will satisfy the 
client before starting large scale experimental work. 


Determine how much time and money are available for computer
runs.


Make a list of the factors considered important by the 
local experts and rank them in importance. 


Examine the known relationships between the factors and 
the outcomes. Are they linear, logarithmic or polynomial, 
or unknown? 


Consider whether variables act jointly and where a pool- 
ed group of variables can be used. 


Examine existing data and programs to learn the boundaries
for each variable and residual errors. Review the criteria
for reporting results.


Determine the smallest homogeneous set of factors to be 
explored. 


Prepare trial plans and explain to the sponsor how the 
proposed work is to be done to ensure that all necessary 
factors are present. The exploratory steps may require 
several stages of planning and discussion before a good 
design containing the factors of real importance is at- 
tained. 


The final design is then prepared, culminating consid- 
erable preplanning and consultation. 


The analysis may disclose a combination of factors 
which can be used in future runs. 


Bibliography 
1. Bonini, C.P. Simulation of Information and Decision Systems
   in the Firm. Englewood Cliffs, N.J.: Prentice-Hall, Inc.,
   1963.

2. Cochran, W.G., and Cox, G.M. Experimental Designs. New
   York: John Wiley & Sons, 1957.

3. Cox, D.R. Planning of Experiments. New York: John Wiley
   & Sons, 1958.

4. Daniel, C. "Factor Screening in Process Development,"
   Industrial and Engineering Chemistry, LV (1963).

5. Gorman, J.W., and Toman, R.J. "Selection of Variables for
   Fitting Equations," Technometrics, VIII (1966), 27-51.

6. Hocking, R.R., and Leslie, R.N. "Selection of the Best
   Subset in Regression Analysis," Technometrics, IX (1967),
   531-540.

7. Kempthorne, Oscar. The Design and Analysis of Experiments.
   New York: John Wiley & Sons, 1952.

8. Naylor, Thomas H., Burdick, Donald S., and Sasser, W. Earl.
   "Computer Simulation Experiments with Economic Systems:
   The Problem of Experimental Design," Journal of the
   American Statistical Association, LXII (December, 1967),
   1315-1337.

9. Wilson, E.B. Introduction to Scientific Research. New
   York: McGraw-Hill Book Co., 1952.


Response Surface Designs 


Donald S. Burdick and Thomas H. Naylor, 
Duke University 


Introduction 


The fundamentals and underlying philosophy of response sur- 
face methodology (RSM) were first set forth by Box and Wilson 
[17] in 1951. Since that time response surface methods have 
become well known to statisticians and certain physical scien- 
tists (chemical engineers in particular). Response surface 
techniques have not been widely used by economists and manage- 
ment scientists. 

The reason for this lack of enthusiasm for RSM on the part 
of economists and management scientists is fairly obvious. RSM 
was developed primarily as a tool for designing experiments and 
in particular, experiments with physical processes. Since it 
is usually impossible or impractical to perform controlled ex- 
periments with business and economic systems (e.g., firms, 
industries, and the economy as a whole), it is not surprising 
to find that economists and management scientists have shown 
only limited interest in experimental design methods in general 
and RSM in particular. With the advent of computer simulation 
techniques it has become possible to perform a type of pseudo- 
experiment in economics and management science. That is, it is 
now possible to perform simulation experiments with models of 
business and economic systems [39,40]. 

RSM offers a useful approach to the problem of designing 
computer simulation experiments with models of business and 
economic systems. RSM has been used in the simulation studies 
of Hufschmidt [31], Meier [38], and Preston and Collins [41]. 
The work of the California Analysis Center, Inc. (C.A.C.I.) on
SimOptimization appears to offer considerable promise. The
objective of the SimOptimization research is to develop
efficient, economical techniques for locating improved (but not
necessarily optimum) solutions to simulation models where
analytical optimization techniques cannot be realistically
applied [33,37]. In this paper we begin by defining and
classifying the important concepts of RSM and discuss some
general considerations relevant to the experimental
investigation of response surfaces. Several important response
surface techniques are described in detail. A section is also
included which relates RSM to other exploration and
optimization methods which are well known to economists and
management scientists. An example model is included as well as
a section which outlines several unresolved problems in
applying RSM to business and economic problems.


Some General Considerations 


A response surface can be described as a relationship be- 
tween a quantitative or numerical response and quantitative 
factors. The relationship may be deterministic or it may
involve random variables. The functional form of the
relationship may be known explicitly, it may be known except
for the
values of several unknown parameters, or very little may be 
known about the response surface. The aim of the investigation 
might be to explore and describe the response surface over some 
region of interest in the factor space, or it might be to op- 
timize the response over some operability region in the factor 
space. In any case the basic feature of RSM is the experimental
investigation of a response surface using observations
of the response at various factor levels as data. 

The experimental investigation of a response surface can be 
divided into a design phase and an analysis phase. In the de- 
sign phase decisions are made to set the factors at certain 
values and levels. This is equivalent to a selection of
"design points" in the factor space at which the "experimental
runs" are to be performed. In the analysis phase the response
data from the experimental runs are analyzed.

A selection of design points is called an "experimental
design." Experimental designs may be classified as
"simultaneous" or "sequential." In a simultaneous design,
although the experimental runs may be performed one at a time,
the design points are selected simultaneously. In other words,
results
from the earlier runs have no bearing on the selection of de- 
sign points for the later runs. In a purely sequential design 
the runs are performed one at a time and each design point is 
selected using information from previous runs. 

Simultaneous designs are frequently used when the aim is to 
explore the response surface, while sequential designs tend to 
be used to find the location of an optimum response. However, 
whether the aim is to explore or to optimize, a complete exper- 
imental investigation involving a pure simultaneous or a pure 
sequential design is rare. An exploratory investigation is 
likely to cycle through the phases of design and analysis sev- 
eral times before it is done, and an optimum seeking experiment 
is likely to have stages at which simultaneous designs are used 
for a local exploration of the response surface. 


An Outline of Response Surface Techniques 


We begin with a consideration of simultaneous designs. The 
most widely used types of simultaneous designs are the factor- 
ial and fractional factorial designs. 

In the factorial design several values or "levels" are 
chosen for each factor in the experiment. By assigning to each 
factor one of its levels we generate a design point. If all 
the design points obtainable in this way are used, we would 
have a (full) factorial design. Factorial designs [23, pp. 335- 
354] attempt to cover the relevant range of a factor by a ser- 
ies of uniformly spaced values. 

The total number of design points in the full factorial design
is the product of the numbers of levels for each factor. For
example, a four-factor experiment with factor A at 2 levels,
factor B at 2 levels, factor C at 3 levels, and factor D at 4
levels requires 2 x 2 x 3 x 4 = 48 design points. As the
number of factors and the number of levels per factor increase, 
the number of design points required for the factorial grid 
increases markedly. 
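
The count in this example can be checked by enumerating the design points directly. The following is a minimal sketch; the level values and the function name `full_factorial` are hypothetical, chosen only for illustration.

```python
import itertools

# Hypothetical level values for the four factors in the example above.
factor_levels = {
    "A": [0, 1],
    "B": [0, 1],
    "C": [0, 1, 2],
    "D": [0, 1, 2, 3],
}


def full_factorial(levels):
    """Return every design point as a dict mapping factor name to level."""
    names = list(levels)
    return [dict(zip(names, combo))
            for combo in itertools.product(*levels.values())]


design = full_factorial(factor_levels)
print(len(design))  # 2 * 2 * 3 * 4
```

Adding one more factor at three levels would triple the list, which is the rapid growth referred to above.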

It is apparent that the full factorial design can require an 
exceedingly large number of design points if more than a very 
few factors are to be explored. If we require a complete in- 
vestigation of the factors in the experiment, including main 
effects and interactions of all orders, then there is no
solution to the problem of "too many factors." If, however, we
are willing to settle for a less than complete investigation, 
perhaps including main effects and two-factor interactions, 
then there are designs which will accomplish our purpose and 
which require fewer design points than the full factorial. 
Fractional factorial designs [23,26] are examples of designs 
which require only a fraction of the design points required by 
the full factorial design. Bonini [2] used a fractional
factorial design in his computer simulation experiments with a
model of a firm.
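
As an illustration of the idea, and not of the particular design used by Bonini, the sketch below builds a two-level half fraction in which the level of the last factor is the product of the levels of the others. The function name `half_fraction` is hypothetical.

```python
import itertools


def half_fraction(k):
    """Two-level half fraction 2^(k-1): the last factor is set to the product
    of the others (defining relation I = x1*x2*...*xk)."""
    runs = []
    for base in itertools.product([-1, 1], repeat=k - 1):
        last = 1
        for level in base:
            last *= level
        runs.append(base + (last,))
    return runs


design = half_fraction(4)   # 8 runs instead of the 16 of the full 2^4
for run in design:
    print(run)
```

With four factors this fraction keeps the four main-effect columns mutually orthogonal and free of two-factor interactions, but the two-factor interactions are confounded with one another in pairs. That is the price paid for halving the number of design points.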

Selection of design points according to some random sampling 
scheme is another type of simultaneous design. The simplest 
such scheme would utilize the uniform distribution to sample 
design points "at random" from some region in the factor space. 
The uniform sampling scheme is appropriate in the absence of
a priori information about the response surface, but in such
cases it tends to be less efficient than the factorial or frac- 
tional factorial designs. 

If prior information about the response surface is available, 
it can often be utilized to improve the sampling scheme. For 
example, if it is known that over some areas of the region of 
interest the response surface is likely to be bumpy and highly
variable while over other areas the surface is expected to be
smooth, then a stratified sampling scheme which samples the
bumpy areas more frequently than the smooth areas would be in- 
dicated. 
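
A minimal sketch of such a stratified scheme for a single factor follows. The strata, the allocation of points, and the function name `stratified_sample` are all hypothetical; in practice the allocation would be guided by whatever is known about the surface.

```python
import random


def stratified_sample(strata, rng):
    """Draw uniformly within each stratum; strata is a list of
    ((low, high), n_points) pairs for a single factor."""
    points = []
    for (low, high), n_points in strata:
        points.extend(rng.uniform(low, high) for _ in range(n_points))
    return points


rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Hypothetical allocation: the "bumpy" interval [0, 1] gets three times as
# many design points as the much wider "smooth" interval [1, 4].
strata = [((0.0, 1.0), 12), ((1.0, 4.0), 4)]
points = stratified_sample(strata, rng)
print(len(points))
```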

On the other hand if enough is known about the response sur- 
face, then a deterministic design may again be best. In the 
extreme case where the response surface is given by an explicit 
function, it might even be possible to forego experimentation 
altogether in favor of the techniques of mathematical analysis. 

As a general rule to which there will undoubtedly be excep- 
tions we can state that random designs are not best when we 
know either too little or too much about the response surface. 
If we know something about the response surface, possibly at 
an early to middle stage of the investigation but are uncer- 
tain as to how to exploit our knowledge with a deterministic 
design, it may be that random design methods will prove useful.

Rotatable designs are designs which have a center point and 
for which the standard deviation of the fitted response at any 
point in the factor space depends only on the distance of the 
point from the center point and not on its direction. The de- 
sign points can therefore be rotated about the center without 
changing the variance of any point in the fitted response sur- 
face. Rotatable designs require quantitative factors whereas 
factorial designs do not. By utilizing the numerical proper- 
ties of the factors, rotatable designs can provide virtually 
the same amount of information about the response surface with 
fewer design points. 

A fairly simple type of design is the cube plus star plus
center points. If we take the origin as the center of the
design, the cube portion will just be a two level full
factorial with each factor at the levels $-a$ and $+a$. Writing
$k$ for the number of factors, we can designate the $2^k$
design points in the cube by $(\pm a, \pm a, \ldots, \pm a)$
where it is understood that each combination of plus signs and
minus signs yields one of the points. The star portion consists
of the $2k$ points $(\pm b, 0, \ldots, 0)$,
$(0, \pm b, 0, \ldots, 0)$, ..., $(0, 0, \ldots, \pm b)$. The
center points consist of $N_0$ runs at the center of the
design, i.e., the origin $(0, 0, \ldots, 0)$. The condition for
rotatability is that $b/a = 2^{k/4}$.

Before taking up the topic of sequential optimum seeking
designs we should perhaps mention the relevance of regression
analysis to RSM. Least squares regression analysis is a method 
for fitting a response surface to observed data in such a way 
as to minimize the sum of squared deviations of the observed 
responses from the value predicted from the fitted response 
surface. When the mathematical form of the response function 
is unknown, it can sometimes be approximated satisfactorily, 
within the experimental region, by a polynomial. Designs which 
allow estimation of the parameters for first-order polynomials 
are called first-order designs and designs which allow estima- 
tion of the parameters of second-order polynomials are called 
second-order designs, etc. Chapter 8A of Cochran and Cox [23] 
contains an excellent exposition of first-order and second- 
order designs. The paper by Hill and Hunter [30] includes a 
survey of third-order designs. 
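
The cube plus star plus center design described earlier, which is a second-order design, can be generated in a few lines. The sketch below is an illustration only; the function names and the choice of three factors are hypothetical. It also checks the rotatability condition numerically by means of the usual fourth-moment requirement for second-order rotatable designs (the sum of $x_1^4$ over the design equals three times the sum of $x_1^2 x_2^2$), a condition which is assumed here and is not stated in the text above.

```python
import itertools


def cube_star_center(k, a=1.0, n_center=1):
    """Cube plus star plus center design in k factors, with the star
    distance b chosen so that b / a = 2 ** (k / 4)."""
    b = a * 2 ** (k / 4)
    cube = [tuple(a * s for s in signs)
            for signs in itertools.product([-1, 1], repeat=k)]
    star = []
    for i in range(k):
        for s in (-1, 1):
            point = [0.0] * k
            point[i] = s * b
            star.append(tuple(point))
    center = [tuple([0.0] * k)] * n_center
    return cube + star + center


def fourth_moments(design):
    """Return (sum of x1^4, sum of x1^2 * x2^2) over the design points."""
    pure = sum(p[0] ** 4 for p in design)
    mixed = sum(p[0] ** 2 * p[1] ** 2 for p in design)
    return pure, mixed


design = cube_star_center(k=3, n_center=2)
pure, mixed = fourth_moments(design)
print(len(design))       # 2**3 cube points + 2*3 star points + 2 center points
print(pure, 3 * mixed)   # equal, up to rounding, for a rotatable design
```

For three factors the design needs 16 runs, compared with the 27 of a three level full factorial.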

We now turn to the topic of sequential designs used in op- 
timum seeking experiments. For definiteness we shall assume 
that the optimum occurs when the response is maximized. With 
the obvious changes in wording the discussion will apply equally 
well when the minimum response is optimal. 

One way of finding a maximum is to conduct a thorough explor- 
ation of the response surface. As we have already seen, if 
there are more than just a few factors, this process can require 
a very large number of design points. 

The problem here is the result of a multidimensional factor 
space. A thorough exploration requires us to look in many 
directions. If instead we only had to look in a single direc- 
tion along a one dimensional line, we could get by with far 
fewer design points. In a maximum seeking experiment our ob- 
jective is to climb the hill to the top. If we can find a 
direction which will take us up the response surface at a good 
rate, then we are happy to explore in that direction and forget
about the many others. The substantial saving in design points
from exploring in one direction instead of many is the
principle behind the sequential optimum seeking designs of RSM.

The method of steepest ascent is one of the two major op- 
timum seeking methods of RSM. We require a design point in 
the factor space as a starting value for the procedure. As a 
first step we need a linear approximation, i.e., an approxi- 
mation hyperplane, to the response surface in the neighborhood 
of the starting value. Unless the response surface is a known 
function, we must explore the neighborhood of the starting 
value in order to fit an approximating hyperplane. A fractional 
factorial design or first-order rotatable design might be used 
to obtain a linear fit by least squares. 

The next step is to explore along the direction of steepest 
ascent as determined by the approximating hyperplane. Design 
points would be chosen along this direction until a point is 
reached at which no further progress seems likely. At this 
point a new local exploration is performed with the possible 
result that a new direction of steepest ascent will be obtained. 

After several cycles of this process we are apt to reach a 
point at which the local linear fit is a nearly horizontal
hyperplane with no direction of steepest ascent. This will be
the case if we have reached the top of the hill. It can also 
happen if we have reached a saddle point which is a maximum in 
some directions but a minimum in others. As a final step, 
therefore, we would explore the neighborhood of the apparent 
maximum more thoroughly in order to fit a local quadratic ap- 
proximation to the response surface. This could be done using 
a second-order rotatable design or perhaps a three level frac-
tional factorial design. If the approximating quadratic is
negative definite, then we have in fact reached the top of a
hill.
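
The cycle of local exploration followed by movement along the path of steepest ascent can be sketched in Python. This is an illustration only, not a prescribed procedure: the response function `hill`, the half-width of the local design, the step length, and the stopping tolerance are all assumptions chosen for the sketch, and the local linear fit uses a two-level full factorial rather than a fractional design.

```python
import itertools
import math

def fit_local_slopes(response, center, half_width):
    """Estimate first-order slopes from a two-level full factorial around `center`."""
    k = len(center)
    design = list(itertools.product((-1.0, 1.0), repeat=k))
    ys = [response([c + half_width * s for c, s in zip(center, signs)])
          for signs in design]
    n = len(design)
    # For an orthogonal +/-1 design the least-squares slope of factor j
    # is sum(sign_j * y) / (n * half_width).
    return [sum(s[j] * y for s, y in zip(design, ys)) / (n * half_width)
            for j in range(k)]

def steepest_ascent(response, start, half_width=0.1, step=0.25,
                    max_cycles=200, max_steps=1000, tol=1e-3):
    point = list(start)
    for _ in range(max_cycles):
        slopes = fit_local_slopes(response, point, half_width)
        norm = math.sqrt(sum(b * b for b in slopes))
        if norm < tol:                 # fitted hyperplane is nearly horizontal
            break
        direction = [b / norm for b in slopes]
        best = response(point)
        moved = False
        for _ in range(max_steps):     # move along the path while the response improves
            trial = [p + step * d for p, d in zip(point, direction)]
            y = response(trial)
            if y <= best:
                break
            point, best, moved = trial, y, True
        if not moved:
            step /= 2.0                # first step overshot: shorten it and re-explore
    return point

def hill(x):
    # Hypothetical response surface with a single maximum at (1, 2).
    u, v = x[0] - 1.0, x[1] - 2.0
    return 10.0 - u * u - 2.0 * v * v - u * v

top = steepest_ascent(hill, [0.0, 0.0])
print(top)
```

In a stochastic simulation each call to `response` would be a noisy run, so the comparison `y <= best` would ordinarily be replaced by a statistical test or by averaging over replications.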

The single-factor method is the second major optimum seek- 
ing method of RSM. In this method the level of a single fac- 
tor is varied while the levels of all other factors are held 
constant. When no further improvement is possible, the single 
factor is held fixed at its best level and another factor is
chosen to be varied. After this single-factor search is per- 
formed for each factor in the experiment, the cycle begins 
again by varying the original factor once more. The process 
ends when no change in the level of any single factor leads to 
an increase in the level of the response.
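
A minimal sketch of the single-factor method follows. The fixed step length, the cycle limit, and the response function `hill` are assumptions made for illustration; a practical search would also have to allow for noise in the response.

```python
def single_factor_search(response, start, step=0.1, max_cycles=1000):
    point = list(start)
    best = response(point)
    for _ in range(max_cycles):
        improved = False
        for j in range(len(point)):        # vary one factor, hold the others fixed
            for delta in (step, -step):
                while True:
                    trial = list(point)
                    trial[j] += delta
                    y = response(trial)
                    if y <= best:          # no further improvement in this direction
                        break
                    point, best, improved = trial, y, True
        if not improved:                   # no single-factor change raises the response
            break
    return point, best

def hill(x):
    # Hypothetical response surface with a single maximum at (1, 2).
    u, v = x[0] - 1.0, x[1] - 2.0
    return 10.0 - u * u - 2.0 * v * v - u * v

found, value = single_factor_search(hill, [0.0, 0.0])
print(found, value)
```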

The single-factor method has at least two advantages when 
compared to the method of steepest ascent. One is that it does 
not require even local exploration at intermediate stages of 
the process. Since exploratory designs require so many points, 
this could be a substantial saving. A second advantage for the 
single-factor method occurs when it is markedly easier to vary 
one factor at a time than it is to vary several factors simul- 
taneously. This might be the case for example in an experiment 
using some engineering device (an analog computer, perhaps) 
where factor levels correspond to settings on a dial. 

The single-factor method also has some disadvantages in com- 
parison with the method of steepest ascent. The search can 
move only in directions which are parallel to the coordinate 
axes. As a result it is possible for the search to terminate 
on a sharp ridge which is rising in a direction not parallel 
to any axis. This difficulty can be overcome by exploring the 
neighborhood of the apparent maximum after the search termin-
ates. A more serious difficulty is the possibility that the
search process will not terminate on the ridge but will zigzag 
slowly up it in very small steps. The zigzagging could easily 
consume many more design points than would be required for the 
local exploration stages of the method of steepest ascent. 

Sequential optimum seeking methods are essentially hill- 
climbing techniques. If there are many hills in the response 
surface and we are climbing one of the smaller ones, there is 
no way for these methods to inform us of that fact. If the 
response surface may be multimodal, one should probably perform 
a preliminary exploration to get the lay of the land before 
selecting a starting value. 


As was mentioned above, once a maximum or apparent maximum
has been found, it is desirable to approximate the response 
surface in the neighborhood of the maximum with a quadratic 
function. Canonical analysis is a method of data analysis 
which is helpful in interpreting the approximating quadratic. 
The technique involves a rigid linear transformation in factor 
space to diagonalize the matrix of the quadratic form. It is 
quite similar to a principal components factor analysis. If 
the eigenvalues on the diagonal of the transformed matrix are
all negative, the stationary point is a true maximum. If some 
eigenvalues are positive, it is a saddle point. If some are 
zero or near zero, it is on a stationary or nearly stationary 
ridge in the direction of the corresponding eigenvector. Neg- 
ative eigenvalues of large magnitude indicate a sharp dropoff 
from the maximum in the direction of the corresponding eigen- 
vector. The technique of canonical analysis has been described 
by Box and Wilson [17] and Cochran and Cox [23].
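
For two factors the canonical analysis can be carried out in closed form, as the following sketch shows. It assumes a fitted quadratic of the form y = b0 + b1*x1 + b2*x2 + b11*x1^2 + b22*x2^2 + b12*x1*x2; the function name, the coefficient names, and the tolerance are hypothetical. With more than two factors one would use a general symmetric eigenvalue routine instead.

```python
import math

def canonical_analysis_2d(b1, b2, b11, b22, b12, tol=1e-8):
    """Locate and classify the stationary point of a fitted two-factor quadratic."""
    half = b12 / 2.0                       # off-diagonal entry of the symmetric matrix B
    det = b11 * b22 - half * half
    if abs(det) <= tol:
        # A zero eigenvalue: stationary ridge, no unique stationary point.
        raise ValueError("quadratic form is singular")
    # The stationary point solves 2*B*x = -b.
    x1 = -(b22 * b1 - half * b2) / (2.0 * det)
    x2 = -(b11 * b2 - half * b1) / (2.0 * det)
    # Eigenvalues of the 2 x 2 symmetric matrix [[b11, half], [half, b22]].
    mean = (b11 + b22) / 2.0
    radius = math.hypot((b11 - b22) / 2.0, half)
    eigenvalues = (mean - radius, mean + radius)
    if eigenvalues[1] < 0.0:
        kind = "maximum"                   # all eigenvalues negative
    elif eigenvalues[0] > 0.0:
        kind = "minimum"                   # all eigenvalues positive
    else:
        kind = "saddle"                    # mixed signs
    return (x1, x2), eigenvalues, kind

# Hypothetical fitted surface: y = -1 + 4*x1 + 9*x2 - x1^2 - 2*x2^2 - x1*x2.
print(canonical_analysis_2d(4.0, 9.0, -1.0, -2.0, -1.0))
```

The magnitudes of the negative eigenvalues indicate how sharply the response falls off from the maximum along each canonical axis.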

There are many situations where we are interested in optim-
izing or improving the response, but where we are not free to
explore at will because of the high cost of obtaining a poor 
response. This would be true if we were experimenting on an 
industrial production process and the firm's profits depended 
heavily on the values of the response. A more extreme example 
is the national economy (the real thing, not a model!), where 
a change in government policy variables "just to see what hap-
pens" may prove to be quite costly.

In these situations a starting point is given by current 
operating conditions. It may be possible by making small 
changes in the factors to find a direction which will improve 
the response, and to move gradually along that direction. This, 
in essence, is the technique of evolutionary operation [Des reg Zeal 
29,32,46]. Evolutionary operation can also be used to follow 
an optimum which is changing with time. 

Of course, in computer simulation models we are usually free
to explore the response surface widely and are not bound by con-
straints that are in force in the real world systems they model.
Therefore, evolutionary operation is seldom needed in optimum-
seeking experiments on simulation models. However, the per-
formance of an evolutionary operation plan for a real world
system might well be simulated on a computer.


RSM and Other Optimization Methods


Having defined RSM and having described several response 
surface techniques, we now turn to a comparison of RSM with 
other optimization methods such as classical optimization and 
mathematical programming. In analyzing the properties of re- 
sponse surfaces we encounter two important cases. In the first 
case the exact form of the response function is known. That 
is, the response function is a given explicit function. It is 
this case with which economists and management scientists are 
most familiar, for it is common practice in business and econ- 
omics to analyze a given utility function, cost function, rev- 
enue function, etc. In the second case the exact form of the 
response function is unknown and the response function is 
merely an implicit function. This case arises frequently with 
computer models of economic systems. With complex computer 
simulation experiments the response function may be too compli- 
cated to describe explicitly, unless we consider the entire 
computer program as the response function. 

When we are dealing with an explicit response function, we
may take either an analytic or an experimental approach to op- 
timization. An attempt to find the optimum as a stationary 
point of the response function by setting partial derivatives 
equal to zero and solving the resulting simultaneous equations 
is an example of an analytic approach. Of course, if the sys- 
tem of simultaneous equations is too complex, this method may 
be unfeasible. 

Another possibility would be to approach the optimum sequen- 
tially by evaluating the partial derivatives at a selected 
point and using them to obtain a promising direction along which
to select the next point. This is a method which combines
analysis and experimentation. Obtaining formulas for partial 
derivatives from an explicit response function is an analytic 
technique, but the evaluation of these derivatives at speci- 
fic points and the use of these values to guide the search 
for the optimum is experimental in nature. 

As a modification of the above method one could approxi- 
mate the derivatives experimentally by making small displace- 
ments in the factors and evaluating the resulting change in 
the response. By now we have made the optimization method 
almost purely experimental, and at this point, not surprising- 
ly, there is a large overlap with RSM. The experimental approx- 
imation of partial derivatives is tantamount to the experi- 
mental determination of an approximating tangent hyperplane. 
The gradient direction obtained from the approximate partial 
derivatives is exactly the direction of steepest ascent ob- 
tained from the fitted tangent hyperplane. 
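
The experimental approximation of partial derivatives can be illustrated with central differences. In this sketch the response function `hill` and the displacement `h` are assumptions; the resulting vector is the slope of the approximating tangent hyperplane and therefore points along the direction of steepest ascent.

```python
def central_difference_gradient(response, point, h=1e-3):
    """Approximate each partial derivative by a small displacement of one factor."""
    gradient = []
    for j in range(len(point)):
        up = list(point)
        down = list(point)
        up[j] += h
        down[j] -= h
        gradient.append((response(up) - response(down)) / (2.0 * h))
    return gradient

def hill(x):
    # Hypothetical response surface with a single maximum at (1, 2).
    u, v = x[0] - 1.0, x[1] - 2.0
    return 10.0 - u * u - 2.0 * v * v - u * v

g = central_difference_gradient(hill, [0.0, 0.0])
print(g)  # approximately [4.0, 9.0], the analytic gradient of `hill` at the origin
```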

Before moving to a consideration of an example simulation 
model, we should discuss, at least briefly, the concepts of 
experimental error and specification error [44]. Experimental 
error occurs when disturbances in the data cause our estimates 
of the parameters in the response function we are trying to 
fit to differ from the desired or "correct" values of those 
parameters. Stochastic generation of data is a common source 
of experimental error, but round-off error can also contribute 
a significant amount to experimental error on occasion. 

Specification error occurs when the true response function 
is not a member of the parametric family of functions which 
are under consideration. For example, we may be fitting a 
polynomial to a response function which is not of the poly-
nomial type. In this situation even the best choice of param-
eters will yield a function which does not correspond exactly to
the true response function. The resulting deviation is called
specification error.

An Example Model 


To illustrate the application of RSM to economics we shall 
consider a computer simulation model. The model is a sto- 
chastic model of a multi-product, multi-factor firm. Although 
the model is hypothetical and relatively simple, it does pos- 
sess some of the characteristics of more complex economic 
models. The variables and equations for the model are listed
below:


Endogenous Variable
$\pi_t$ = profit in period t.

State Variables
$R_t$ = total revenue in period t.
$C_t$ = total cost in period t.

Policy Variables
$Q_{it}$ = quantity of the ith product produced and sold
in period t, i = 1, ..., m.
$X_{ijt}$ = quantity of the jth factor input used in the
production of the ith product in period t,
i = 1, ..., m; j = 1, ..., n.

Behavioral Equations
Revenue Function.
(16) $R_t = R_t(Q_{1t}, Q_{2t}, \ldots, Q_{mt}, u_t)$

Cost Equation.
(17) $C_t = C_t(X_{11t}, X_{12t}, \ldots, X_{mnt}, v_t)$

Production Functions.
(18) $Q_{it} = Q_{it}(X_{i1t}, X_{i2t}, \ldots, X_{int}, w_{it})$, i = 1, ..., m

Identity
(19) $\pi_t = R_t - C_t$

The disturbance terms $u$ and $v$ are assumed to be normally
distributed with zero mean and given variances. The $w_i$'s are
assumed to have a multi-variate normal distribution with zero 
mean and a given variance-covariance matrix. 

Suppose that we are interested in evaluating the effects of 
our m + mn policy variables (factors) on the single response vari-
able, total profit. Assume that we are interested in a plan-
ning horizon of length T. The values of the policy variables 
are assumed to be fixed over a given planning horizon. That 
is, they do not vary between periods within T. Our simulation 
experiment consists of runs of length T for different values 
of the policy variables. 

At the beginning of each simulation run we fix the values 
of $Q_i$ and $X_{ij}$ for all i and j. These values, of course, must
satisfy (18). Each period during the run we compute the value 
of the firm's profit according to (19) in terms of the given 
values of the policy variables and the stochastic variates. 
The stochastic variates are assumed to have been generated by 
normal and multivariate normal subroutines [53, pp. 97-99] 
respectively. Thus each run consists of the generation of T
values for $\pi_t$. At the end of each run we compute total profit
TP according to the formula

(20) $TP = \sum_{t=1}^{T} \frac{\pi_t}{(1+r)^t}$

where r is the rate of time preference.
Thus it can be seen that even for this relatively simple
example the response surface

(21) $TP = TP(Q_1, \ldots, Q_m, X_{11}, \ldots, X_{mn})$

may be extremely complex. (The degree of complexity depends
on the form of the revenue, cost, and production functions.)
In fact, the only way in which the response function can be 
described explicitly is by considering the entire computer pro- 
gram for the simulations as the response function. 
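
To make the idea of "the program as the response function" concrete, the following sketch simulates one run for a single-product, two-factor firm. Every functional form and numerical constant in it is hypothetical, the production function constraint is omitted for brevity, and the disturbances are drawn with unit variance; only the profit identity and the discounting of per-period profit follow the description above.

```python
import random

def discounted_total_profit(profits, r):
    """TP = sum over t = 1..T of pi_t / (1 + r)**t."""
    return sum(p / (1.0 + r) ** t for t, p in enumerate(profits, start=1))

def simulate_run(q, x1, x2, T, r, rng):
    """One simulation run of length T with the policy variables held fixed."""
    profits = []
    for _ in range(T):
        u = rng.gauss(0.0, 1.0)                        # revenue disturbance
        v = rng.gauss(0.0, 1.0)                        # cost disturbance
        revenue = 20.0 * q - 0.5 * q * q + u           # hypothetical revenue function
        cost = 2.0 * x1 + 3.0 * x2 + 0.1 * q * q + v   # hypothetical cost function
        profits.append(revenue - cost)                 # profit = revenue - cost
    return discounted_total_profit(profits, r)

# The response at one design point (Q, X1, X2) is a single noisy value of TP.
tp = simulate_run(q=5.0, x1=1.0, x2=1.0, T=10, r=0.05, rng=random.Random(1))
print(tp)
```

Passing the generator `rng` explicitly makes each run reproducible, which is convenient when the same random numbers are to be reused across design points.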

We have assumed for the sake of illustration that the fac-
tors are to be held fixed for the entire planning horizon and
that our objective is to find the levels of these factors which 
will maximize total profit. Current real world values of the 
factors might serve well as a starting point. As a further
simplification we assume that our hypothetical firm produces
a single product from only two factors of production, e.g.,
labor and capital. This in effect reduces our simulation ex- 
periment to an experiment involving only three factors. 

The steepest ascent method would be preferred to the single 
factor method in this example for several reasons. First, with 
only three factors in the model a two level full factorial 
requires only eight design points. Thus, a local linear fit,
which the method of steepest ascent requires, could be made 
economically. Second, it seems likely that the factors are 
strongly interrelated and that the response surface may have 
ridges running in directions not parallel to any coordinate 
axis in the factor space. For this type of response surface 
the method of steepest ascent usually works much better than 
the single factor method. Finally, since this is a model for 
simulation by a digital computer, there is no particular advan-
tage to varying the factors one at a time. (The book by Lavi
and Vogl [34] suggests a number of other techniques including 
conjugate gradient methods which might be used with this 
model.) 

Once we have reached an apparent maximum, we would explore
the vicinity with a second order rotatable design centered at
the point of maximum response and fit a quadratic function to
our data. In our three factor example a cube plus star design
with three center points would require $2^3 + 2(3) + 3 = 17$ design
points. The final step would be to perform a canonical reduc-
tion of our fitted quadratic to check whether the apparent max-
imum is a true maximum and to examine the sensitivity of the
response to departures from the maximum point in various direc- 
tions. 
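
The count of design points can be checked by generating the design in coded units. The sketch below builds a cube plus star design for k factors; the function name is hypothetical, and the axial distance is taken as the fourth root of the number of cube points, the usual choice for rotatability when the cube portion is a full factorial.

```python
import itertools

def cube_plus_star(k, n_center):
    """Cube plus star design in coded units."""
    cube = [list(p) for p in itertools.product((-1.0, 1.0), repeat=k)]
    alpha = len(cube) ** 0.25          # axial distance for a rotatable design
    star = []
    for j in range(k):
        for sign in (-1.0, 1.0):
            point = [0.0] * k
            point[j] = sign * alpha    # one factor displaced, the others at center
            star.append(point)
    center = [[0.0] * k for _ in range(n_center)]
    return cube + star + center

design = cube_plus_star(3, 3)
print(len(design))  # 8 cube + 6 star + 3 center = 17
```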

We conclude this section with a word of caution. Like any
model, an economic model is not a perfect representation of the
object it models. Real world data is used to estimate the 
parameters in the behavioral equations. This data will include 
values for the variables which are the factors for our experi- 
ment so that there are certain real world data points in the 
factor space. Now there are two response surfaces we can con-
sider: one in the real world, relating the real world response
to real world factors, and the other the corresponding re-
sponse surface in the model. It is our hope that the two will
be very close, but this is much more likely to be true near 
the real world data points than far away from them. If our 
search leads us to an optimum point far from the real world 
data points, any resulting conclusions about the location of 
the optimum point in the real world should be made with great
caution.


Some Unresolved Problems 


In reviewing the literature on response surface methods and 
relating it to economics and management science one cannot help 
but be impressed with the number of unresolved problems which 
remain for econometricians and management scientists to pursue. 
We shall briefly summarize some of these. 

1. We have previously alluded to the problem of "too many
factors" in our discussion of factorial designs. Unfortunately, 
this problem is relevant not only to factorial methods but to 
response surface methods in general. Although high-speed com- 
puters and fractional factorial designs partially alleviate 
this problem, they by no means eliminate it. 

2. When curve fitting is an integral part of the analysis of 
an implicit response function the problem of nonlinearity may 
arise. If the response function is merely nonlinear in vari-
ables, then polynomial approximations and various transforma-
tions may provide relatively easy solutions to the curve-fit-
ting problem. If the response function is nonlinear in param-
eters, then the estimation problem becomes very difficult.

3. Throughout our discussion of RSM and related techniques 
we have made direct reference to the problem of convergence. 
Under what conditions will a particular response surface method 
converge on a global optimum? How quickly does the method con- 
verge? In many cases, the literature describing the convergence 
properties of the various RSM techniques has been restricted to 
comparisons of several techniques using very special examples. 
It is difficult to generalize on the convergence properties of 
alternative RSM techniques when one restricts himself to a small 
sample of special cases. There appears to be a need for exper- 
ience with practical examples to help the users of RSM techniques 
decide when a particular technique is more appropriate than an- 
other. It is well known that the whole question of non-convexity 
and speed of convergence for optimum seeking methods needs fur- 
ther study not only within the RSM framework but within classi- 
cal optimization and mathematical programming frameworks. 

4. Constrained optimization of implicit response functions
is an area which seems to have been almost completely ignored 
by researchers in the field of response surface methodology. 
This topic is particularly relevant to computer simulation exper-
iments with models of economic systems.


Bibliography 
1. Baasel, William D. "Exploring Response Surfaces to Estab-
lish Optimum Conditions," Chemical Engineering, (October
25, 1965).

2. Bonini, Charles P. Simulation of Information and Decision
Systems in the Firm. Englewood Cliffs, N.J.: Prentice-
Hall, Inc., 1963.

3. Bose, R.C., and Carter, R.L. "Complex Representation in
the Construction of Rotatable Designs," Annals of Mathe-
matical Statistics, XXX (1959).

4. Bose, R.C., and Draper, N.R. "Second Order Rotatable De-
signs in Three Dimensions," Annals of Mathematical Stat-
istics, XXX (1959).

5. Box, G.E.P. "Multifactor Designs of First Order," Biome-
trika, XXXIX (1952), 49-57.


6. Box, G.E.P. "The Exploration and Exploitation of Response
Surfaces: Some General Considerations and Examples,"
Biometrics, X (1954), 16-60.

7. Box, G.E.P. "Evolutionary Operation: A Method for Increas-
ing Industrial Productivity," Applied Statistics, VI
(1957).

8. Box, G.E.P. "The Effects of Errors in the Factor Levels
and Experimental Design," Bulletin of the International
Statistical Institute, XXXVII (1961).

9. Box, G.E.P., and Coutie, G.A. "Application of Digital
Computers in the Exploration of Functional Relationships,"
Proceedings of the Institute of Electrical Engineers,
CIII, Part B, Supplement No. I (1956), 100-107.

10. Box, G.E.P., and Behnken, D.W. "Simplex-Sum Designs: A
Class of Second Order Rotatable Designs Derivable from
Those of First Order," Annals of Mathematical Statistics,
XXXI (1960).

11. Box, G.E.P., and Behnken, D.W. "Some New Three Level De-
signs for the Study of Quantitative Variables," Techno-
metrics, II (1960), 455-474.

12. Box, G.E.P., and Draper, N.R. "A Basis for the Selection
of a Response Surface Design," Journal of the American
Statistical Association, LIV (1959), 622-654.

13. Box, G.E.P., and Draper, N.R. "Choice of Second Order Ro-
tatable Designs," Biometrika, L (1963), 335-352.

14. Box, G.E.P., and Hunter, J.S. "Multi-factor Experimental
Designs for Exploring Response Surfaces," Annals of Math-
ematical Statistics, XXVIII (1957), 195-241.

15. Box, G.E.P., and Hunter, William G. "The Experimental Study
of Physical Mechanisms," Technometrics, VII (1965), 23-42.

16. Box, G.E.P., and Lucas, H.L. "Design of Experiments in
Non-Linear Situations," Biometrika, XLVI (1959), 77-90.

17. Box, G.E.P., and Wilson, K.B. "On the Experimental Attain-
ment of Optimum Conditions," Journal of the Royal Sta-
tistical Society B, XIII (1951), 1-45.

18. Box, G.E.P., and Youle, P.V. "The Exploration and Exploit-
ation of Response Surfaces: An Example of the Link Be-
tween the Fitted Surface and the Basic Mechanism of the
System," Biometrics, XI (1955), 287-323.

19. Brooks, S.H. "A Comparison of Maximum-Seeking Methods,"
Operations Research, VII (1959).

20. Burdick, Donald S., and Naylor, Thomas H. "Design of Com-
puter Simulation Experiments for Industrial Systems,"
Communications of the ACM, IX (May, 1966), 329-339.


21. Carpenter, B.H., and Sweeney, H.C. "Process Improvement
with Simplex Self-Directing Evolutionary Operation,"
Chemical Engineering, (July 5, 1965).

22. Carroll, C.W. "The Created Response Surface Technique for
Optimizing Nonlinear Restrained Systems," Operations
Research, IX (March-April, 1961), 169-184.

23. Cochran, W.G., and Cox, G.M. Experimental Designs. New
York: John Wiley & Sons, 1957.

24. Dorfman, Robert. "Steepest Ascent Under Constraint," Sym-
posium on Simulation Models. Edited by A.C. Hoggatt and
F.E. Balderston. Cincinnati, O.: South-Western Publishing
Co., 1963.

25. Draper, N.R., and Smith, H. Applied Regression Analysis.
New York: John Wiley & Sons, 1966.

26. "Fractional Factorial Designs for Factors at Two and Three
Levels," U.S. Department of Commerce, National Bureau of
Standards, Applied Mathematics Series 58, U.S. Government
Printing Office, Washington, D.C., September 1, 1961.

27. Goldfeld, Stephen M.; Quandt, Richard E.; and Trotter,
Hale F. "Maximization by Quadratic Hill-Climbing," Econ-
ometrica, XXXIV (July, 1966), 541-551.

28. Hartley, H.O. "The Modified Gauss-Newton Method for Fit-
ting Non-Linear Regression Functions by Least Squares,"
Technometrics, III (1961), 269-280.

29. Healea, Gary F. Evolutionary Methods as Applied to Simula-
tion Models. Unpublished Master of Business Administration
Research Report, University of Washington, 1966.

30. Hill, William J., and Hunter, William G. "A Review of Re-
sponse Surface Methodology: A Literature Survey," Tech-
nometrics, VIII (November, 1966), 571-590.

31. Hufschmidt, M.M. "Analysis of Simulation: Examination of
Response Surface," Design of Water-Resource Systems.
Edited by Arthur Maass, et al. Cambridge: Harvard Univer-
sity Press, 1962.

32. Hunter, W.G., and Kittrell, J.R. "Evolutionary Operation:
A Review," Technometrics, VIII (August, 1966).

33. Karr, Herbert W., et al. "Simoptimization Research: Phase
I," California Analysis Center, November 1, 1965.

34. Lavi, Abrahim, and Vogl, Thomas P. (eds.). Recent Advances
in Optimization Techniques. New York: John Wiley & Sons,
1966.

35. Leon, A. "A Classified Bibliography on Optimization,"
Recent Advances in Optimization Techniques. Edited by
A. Lavi and T.P. Vogl. New York: John Wiley & Sons, 1966.


36. Ling, T.Y. "A Cluster Sampling Theory and Integrated Multi-
Regression Methods for Dynamic Simulation Experimenta-
tion." Unpublished manuscript, Advanced Systems Develop-
ment Division, IBM Corporation, Washington, D.C., 1967.

37. Luther, E.L., and Markowitz, H.M. "Simoptimization Research:
Phase II," California Analysis Center, July 15, 1966.

38. Meier, Robert C. "The Application of Optimum-Seeking Tech-
niques to Simulation Studies: A Preliminary Evaluation,"
Journal of Financial and Quantitative Analysis, (March,
1967), 31-51.

39. Naylor, Thomas H., Balintfy, Joseph L., Burdick, Donald S.,
and Chu, Kong. Computer Simulation Techniques. New York:
John Wiley & Sons, 1966.

40. Naylor, Thomas H., Burdick, Donald S., and Sasser, W. Earl.
"Computer Simulation Experiments with Economic Systems:
The Problem of Experimental Design," Journal of the
American Statistical Association, (December, 1967), 1315-
1337.

41. Preston, Lee E., and Collins, Norman R. Studies in a Sim-
ulated Market. Research Program in Marketing, Graduate
School of Business Administration, University of Califor-
nia, Berkeley, 1966.

42. Rogers, Peter P. Random Methods for Non-Convex Programming.
Cambridge: Harvard Water Resources Group, Harvard Univer-
sity, June, 1966.

43. Saaty, Thomas L., and Bram, Joseph. Nonlinear Mathematics.
New York: McGraw-Hill Book Co., 1964.

44. Theil, H. "Specification Errors and the Estimation of Econ-
omic Relationships," Review of the International Statis-
tical Institute, XXV (1957), 41-51.

45. Umland, A.W., and Smith, W.N. "The Use of LaGrange Multi-
pliers with Response Surfaces," Technometrics, I (August,
1959).

46. Wilde, D.J. Optimum Seeking Methods. Englewood Cliffs,
N.J.: Prentice-Hall, Inc., 1964.

47. Wilde, D.J., and Beightler, C.S. Foundations of Optimiza-
tion. Englewood Cliffs, N.J.: Prentice-Hall, Inc., 1967.


Sequential Designs 


Herman Chernoff, Stanford University 


Introduction 


Sequential analysis was originally introduced as an alter- 
native to fixed sample size experimentation to take advantage 
of the possibility that the first few observations provide 
enough information for it to be unnecessary to gather more 
costly data. In cases where there are alternative experiments 
which may be carried out to obtain additional data, the rela- 
tive goodness of these experiments may depend on facts which 
are unknown but on which the data has bearing. Thus as infor- 
mation accumulates, it may be used to help select subsequent 
experiments which may prove to be more efficient. 

We shall present an asymptotic theory of optimal sequential 
design of experiments after first introducing some results for 
fixed sample size problems. 

Finally these results will be briefly compared with some 
other approaches and the relevance to Monte Carlo experiments 
will be discussed. The main point of relevance consists of 
stressing the potential usefulness of combining several experi-
ments.


Fixed Sample Tests of Simple Hypotheses 


Consider the problem of testing an hypothesis $H_1: f(x) = f_1(x)$
versus the alternative $H_2: f(x) = f_2(x)$ where $X_1, X_2, \ldots, X_n$
are independent and identically distributed (i.i.d.) with
probability density function $f(x)$, and $f_1$ and $f_2$ are specified.
The classical theory of testing hypotheses indicates that the
optimal tests are the likelihood-ratio tests, which consist
of rejecting $H_1$ if the likelihood ratio $\lambda_n$ is small enough and
accepting $H_1$ (rejecting $H_2$) otherwise. That is, we reject $H_1$ if

(1) $\lambda_n = \prod_{i=1}^{n} f_1(X_i) \Big/ \prod_{i=1}^{n} f_2(X_i) < k'_n$

or equivalently if

(2) $n^{-1} \sum_{i=1}^{n} Y_i < k_n$

where

(3) $Y_i = \log [f_1(X_i)/f_2(X_i)]$

and $k_n = n^{-1} \log k'_n$. Let $P_H$ and $E_H$ represent probability and
expectation under the hypothesis $H$. Then the error probabil-
ities are given by

(4) $\varepsilon_{in} = P_{H_i}\{\text{reject } H_i\}$, $i = 1, 2$.


By increasing $k_n$ one increases the probability of rejecting $H_1$,
thereby increasing $\varepsilon_{1n}$ but decreasing $\varepsilon_{2n}$. In a Bayesian con-
text, if one attached prior probabilities $w_1$ and $w_2 = (1 - w_1)$
to $H_1$ and $H_2$ and costs $r_1$ and $r_2$ for making the wrong decision
when $H_1$ and $H_2$ are true, the risk or expected cost of this proce-
dure is given by

(5) $R_n = w_1 r_1 \varepsilon_{1n} + w_2 r_2 \varepsilon_{2n}$

and one should select $k_n$ to minimize $R_n$. To study $\varepsilon_{1n}$ and $\varepsilon_{2n}$
asymptotically we observe [7,9] that if $\bar{Z}_n$ is the average of
n independent observations on a random variable Z with mean $\mu$,
then

(6) $\lim_{n \to \infty} [-n^{-1} \log P\{\bar{Z}_n \le a\}] = m(a) = -\log \inf_t E[e^{t(Z-a)}]$ for $a < \mu$.

Thus $P\{\bar{Z}_n \le a\}$ approaches zero roughly like $e^{-nm(a)}$ where $m(a)$
depends on the moment generating function of Z and is monotone
in a for $a < \mu$. The application of this result suggests the
choice of a value of $k_n$ between $E_{H_2}(Y)$ and $E_{H_1}(Y)$, so as to
equalize the rates at which $\varepsilon_{1n}$ and $\varepsilon_{2n}$ approach zero. It is
possible to show that asymptotic optimality in the sense of
minimizing $R_n$ is achieved for $k_n = 0$ and then $R_n$, $\varepsilon_{1n}$ and $\varepsilon_{2n}$
all approach zero roughly like $e^{-nI}$ where

(7) $I = -\log \inf_{0<t<1} \int [f_1(x)]^t [f_2(x)]^{1-t}\,dx$.


(7) feet Von ani) ee, (xe, (x)ax] 
O<t<1 


The number I has some of the properties of a distance or 
information. If I is large it is relatively easy to discrim-
inate between $f_1$ and $f_2$ and thus we may think of $f_1$ and $f_2$ as
being far apart. 

If an alternative experiment were such that the data under 
$H_1$ and $H_2$ had densities $g_1$ and $g_2$ respectively and the corres-
ponding value of I were doubled, then half as many observations
would be required to achieve the same risk and error probabil- 
ities. In this sense I measures information or efficiency of 
experimentation. 

In the above development the requirement that the data $X_i$
be real valued random variables was unimportant. For example
the data could represent vector valued random variables. In
particular suppose $X_i = (U_i, V_i)$ where $U_i$ and $V_i$ are independent
with densities $g_1$ and $h_1$ under $H_1$ and $g_2$ and $h_2$ under $H_2$. Then
it can be shown that

(8) $I(U, V) \le I(U) + I(V)$

where these terms have the natural interpretation. Moreover,
(9) $I(U, U) = 2I(U)$.


Suppose that one has available a choice of "elementary" ex-
periments and must select a design consisting of a choice of
n of these experiments to be performed independently, (repeti- 
tions allowed). Then (8) implies that an asymptotically op- 
timal experiment is constructed by selecting the one which 
maximizes I and repeating it n times. It may cause a loss of 
efficiency if two distinct but equally informative experiments 
are each performed n/2 times. 

We illustrate with an example. A component has unknown
reliability p which is $p_1 = .9$ under $H_1$ and $p_2 = .8$ under $H_2$.
An elementary experiment consists of a set up of r components
in a series arrangement so that a failure results if any of
the r components fail. Thus the probability of success is $p^r$.
Suppose that the cost of a trial is 1 + r if r components are
used. It is desired to test $H_1$ vs. $H_2$ using a large number of
trials.

Although the probability distributions are discrete the
underlying theory applies. Here the integral in (7) may be
regarded as either $E_{H_2}(e^{tY})$ or equivalently as $E_{H_1}(e^{-(1-t)Y})$.
For the experiment $e_r$ where r components are arranged in series,
the information is given by

$I_r = -\log \inf_{0<t<1} [p_1^{rt} p_2^{r(1-t)} + (1 - p_1^r)^t (1 - p_2^r)^{1-t}]$.

Although the cost per observation depends on the experiment,
the above theory still applies with the minor modification that
one must consider maximizing information per unit cost.
Table 1 tells us that the information per unit cost,
$I_r/(1+r)$, is maximized for r = 3.


Table 1. $I_r$ and information per unit cost for example above
($p_1 = .9$, $p_2 = .8$).

r            1      2      3      4      5      6      8      10
I_r          .0101  .0187  .0257  .0325  .0361  .0396  .0458  [illegible]
I_r/(1+r)    .0055  .0061  .0064  .0063  .0060  .0057  .0049  .0041
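
The information number for a discrete experiment can be evaluated numerically, which gives a way to check the entries of Table 1. The sketch below is an illustration under stated assumptions: the function names are hypothetical, natural logarithms are used, and the infimum over t is located by a ternary search, which is valid here because the sum being minimized is a convex function of t.

```python
import math

def chernoff_information(f1, f2, iterations=200):
    """I = -log inf over 0<t<1 of sum_x f1(x)**t * f2(x)**(1-t), discrete case."""
    def mixture_sum(t):
        return sum(a ** t * b ** (1.0 - t) for a, b in zip(f1, f2))
    lo, hi = 0.0, 1.0
    for _ in range(iterations):            # ternary search on a convex function
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if mixture_sum(m1) < mixture_sum(m2):
            hi = m2
        else:
            lo = m1
    return -math.log(mixture_sum((lo + hi) / 2.0))

def series_information(r, p1=0.9, p2=0.8):
    """Information from one trial of r components arranged in series."""
    s1, s2 = p1 ** r, p2 ** r              # success probabilities under H1 and H2
    return chernoff_information([s1, 1.0 - s1], [s2, 1.0 - s2])

for r in (1, 2, 3, 4, 5, 6, 8, 10):
    info = series_information(r)
    print(r, round(info, 4), round(info / (1 + r), 4))

best_r = max(range(1, 11), key=lambda r: series_information(r) / (1 + r))
print(best_r)
```

Dividing by 1 + r reflects the stated cost of a trial with r components; a different cost structure would change only that divisor.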


One aspect of design problems is the choice of an appro-
priate sample size. Note that if the cost per observation is
c, then the overall cost associated with using an experiment
e n times is roughly

$cn + re^{-nI}$

where $re^{-nI}$ approximates $R_n$. For a good approximation r ordin-
arily should be replaced by a function of n of order $n^{-1/2}$.
(This is of little importance in the ensuing calculus where
the exponential term plays the dominant role.) This can be
minimized by taking

$n = I^{-1} \log (rI/c)$

which approaches $\infty$ as the cost per observation approaches zero.
This gives an optimal overall cost of

$cI^{-1} \log (rI/c) + cI^{-1}$,

the main part of which is $(-c \log c)/I$ which is inversely pro-
portional to I and of the order of magnitude of $-c \log c$ as
$c \to 0$.
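
The minimizing sample size follows from setting the derivative of $cn + re^{-nI}$ with respect to n equal to zero, which gives $e^{-nI} = c/(rI)$. A short numerical check is sketched below; the values of c, r, and I are arbitrary illustrative assumptions, and the formula presumes rI > c so that the resulting n is positive.

```python
import math

def total_cost(n, c, r, info):
    """Sampling cost plus approximate risk: c*n + r*exp(-n*I)."""
    return c * n + r * math.exp(-n * info)

def optimal_sample_size(c, r, info):
    """Real-valued minimizer n = log(r*I/c) / I (requires r*I > c)."""
    return math.log(r * info / c) / info

c, r, info = 0.001, 1.0, 0.0257            # illustrative values only
n_star = optimal_sample_size(c, r, info)
best_n = min(range(1, 2001), key=lambda n: total_cost(n, c, r, info))
print(n_star, best_n, total_cost(n_star, c, r, info))
```

Because the cost is convex in n, the best integer sample size is one of the two integers adjacent to the real-valued minimizer.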

The results and the nature of the asymptotically optimal 
solution are insensitive to the initial values of the prior 
probabilities and to the losses $r_1$ and $r_2$. The first is not
surprising in view of the fact that a single observation typ- 
ically has a substantial effect in replacing the prior prob- 
abilities by posterior probabilities. The second is not sur- 
prising in view of the exponential rate at which the error 
probabilities decline. 

As developed so far this theory is limited in that it applies
to testing a simple hypothesis versus a simple alternative and
in that there is no allowance for sequential choice of experiments.


Fixed Sample Designs for Estimating Parameters 


Suppose that for a given value of x we may observe

$$Y = \theta_1 + \theta_2 x + u$$

where u is normally distributed with mean 0 and variance 1.
It is desired to estimate $\theta_2$, the slope of the line, using n
observations, corresponding to values of x between plus and
minus one. Which values of x provide data that will give the
best estimate of $\theta_2$? It is well known that in an optimal
design x = +1 is used for half the observations and x = -1 for
the other half (assuming n is even).
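A short numerical illustration of this well-known result: with unit error variance the least squares slope estimate has variance $1/\sum (x_i - \bar{x})^2$, so the design that pushes the observations to the end points gives the smallest variance. The sketch below compares the end-point design with an equally spaced design; the sample size and the comparison design are illustrative assumptions.

```python
def slope_variance(xs):
    """Variance of the least squares slope when the error variance is 1."""
    n = len(xs)
    mean = sum(xs) / n
    sxx = sum((x - mean) ** 2 for x in xs)
    return 1.0 / sxx

n = 10
endpoint_design = [1.0] * (n // 2) + [-1.0] * (n // 2)
equally_spaced = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]

print("end points:    ", slope_variance(endpoint_design))
print("equally spaced:", slope_variance(equally_spaced))
```

For n = 10 the end-point design gives variance 1/n = 0.1, which is smaller than the variance obtained from the equally spaced design.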

We may generalize this problem as follows. Given an elementary
experiment e of the set $\mathcal{E}$ of available elementary
experiments, we may observe a random variable $X_e$ with probability
density $f(x,\theta,e)$ depending on e and an unknown vector
valued parameter $\theta = (\theta_1, \theta_2, \ldots, \theta_k)$. It is
desired to estimate $g(\theta)$ on the basis of the data obtained from
a design D which consists of a choice of n experiments to be
performed independently (independent repetitions allowed) from
a set $\mathcal{E}$ of elementary experiments. There is no essential
loss of generality in assuming that $g(\theta) = \theta_1$ and we shall
do so.

The Fisher information matrix corresponding to an experiment
e is defined by

(10)   $I(\theta,e) = \left\| -E_\theta \left\{ \frac{\partial^2 \log f(X_e,\theta,e)}{\partial\theta_i \, \partial\theta_j} \right\} \right\|, \quad i,j = 1, \ldots, k,$

where $E_\theta$ represents expectation when $\theta$ is the underlying value
of the parameter. We shall find it convenient to suppress
the subscript $\theta$. Under mild regularity conditions it can be
shown that

(11)   $I(e) = \left\| E \left\{ \frac{\partial \log f(X_e,\theta,e)}{\partial\theta_i} \, \frac{\partial \log f(X_e,\theta,e)}{\partial\theta_j} \right\} \right\|.$








The Fisher information matrix is additive in the sense that 
if two independent experiments are performed, the correspond- 
ing information matrix for this large experiment is the sum 
of the two individual matrices. 

The main relevance of I derives from the fact that if e is
performed n times, then as $n \to \infty$ the maximum-likelihood
estimate $\hat\theta_n$ of $\theta$ has the property that the limiting
distribution of $\sqrt{n}(\hat\theta_n - \theta)$ is normal with mean 0 and
covariance matrix $\Sigma = [I(e)]^{-1}$. Moreover, in a technical sense
which we won't describe here, there is no estimator that can do
better from the point of view of covariance [6]. Thus $I^{11}$, the
1-1 element of the inverse of I, represents the variance of the
best estimate of $\theta_1$.







The right to use the term information for I derives partly 
from the additivity of I and partly from the fact that if the 
information is doubled, the same variance can be achieved with 
half the observations. 

Because of the additivity of information, the problem of
selecting an asymptotically optimal design becomes that of
choosing $D = (e_1, e_2, \ldots, e_n)$ to minimize $I^{11}$, the 1-1
element of $I^{-1}$, where

$I = I(e_1) + I(e_2) + \cdots + I(e_n),$

or equivalently to minimize $\bar{I}^{11}$ where

(12)   $\bar{I} = \frac{1}{n} [I(e_1) + I(e_2) + \cdots + I(e_n)]$

is the average information matrix.

It is not difficult to show that the randomized experiment
where $e_i$ is performed with probability $p_i$, $\sum p_i = 1$, has
information matrix $\sum p_i I(e_i)$. Thus the average information
matrix $\bar{I}$ of (12) is the information matrix of a randomized
experiment. Hence we may pose the optimum design problem as that
of selecting the randomized experiment e, based on the set
$\mathcal{E}$ of available pure elementary experiments, which minimizes
$I^{11}(e)$.

It can be shown under mild conditions that such an optimal
experiment can be found which is a randomized mixture of at
most k of the pure experiments of $\mathcal{E}$ [5]. Asymptotically this
is equivalent to selecting these pure experiments in certain
specified proportions. The example of estimating the slope
of a straight line is a special case with k = 2.

The following example is instructive. The probability of
response to a dose of level x is given by

(13)   $p(x) = \Phi\left(\frac{x-\mu}{\sigma}\right)$

where $\mu$ and $\sigma$ are unknown, and where
$\Phi(u) = \int_{-\infty}^{u} \phi(t)\,dt$ and
$\phi(t) = (2\pi)^{-1/2} e^{-t^2/2}$ are the standard normal
distribution and density functions. This model is related to the
assumption that the dose required to make an individual respond
is normally distributed with mean $\mu$ and variance $\sigma^2$. However,
we cannot observe this dose directly. We can only observe whether
there is or is not a reaction to a prescribed dose level x. The
elementary experiments correspond to the choice of dose levels.
It is desired to estimate $\mu - 2.33\sigma$ which is the dose level at
which only 1% of the individuals will respond.

The information matrix (corresponding to dose level x) may
be computed to be

(14)   $I(x) = \frac{\phi^2[(x-\mu)/\sigma]}{\sigma^2 \, \Phi[(x-\mu)/\sigma] \, \{1 - \Phi[(x-\mu)/\sigma]\}} \begin{pmatrix} 1 & (x-\mu)/\sigma \\ (x-\mu)/\sigma & (x-\mu)^2/\sigma^2 \end{pmatrix}$

where $\mu$ and $\sigma$ are taken to be $\theta_1$ and $\theta_2$ respectively.
For reasons to be discussed later the optimal design consists of
repeating the dose levels $\mu + 1.575\sigma$ and $\mu - 1.575\sigma$ in the
proportions 0.16 and 0.84 [8].
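The quoted design can be reproduced numerically. The sketch below assumes $\mu = 0$ and $\sigma = 1$ (so that dose levels are expressed in standardized units z), uses the rank-one form of the information matrix for a dose level, and applies the geometric argument described later in the paper for rank-one information matrices: the standardized level is taken to maximize $z^2 \phi^2(z)/\{\Phi(z)[1-\Phi(z)]\}$, and the proportions follow from the direction of the vector (1, -2.33). The function names and the grid search are illustrative assumptions.

```python
import math

def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def Phi(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def weight(z):
    """Scalar factor of the information matrix at standardized dose z."""
    return phi(z) ** 2 / (Phi(z) * (1.0 - Phi(z)))

def variance_of_target(design, a=(1.0, -2.33)):
    """a' M^{-1} a, where M is the information matrix of the mixture.

    design is a list of (standardized dose, proportion) pairs.
    """
    m11 = sum(p * weight(z) for z, p in design)
    m12 = sum(p * weight(z) * z for z, p in design)
    m22 = sum(p * weight(z) * z * z for z, p in design)
    det = m11 * m22 - m12 * m12
    return (a[0] ** 2 * m22 - 2.0 * a[0] * a[1] * m12 + a[1] ** 2 * m11) / det

# Grid search for the standardized level maximizing z^2 * weight(z).
grid = [0.5 + 0.001 * j for j in range(2501)]
z0 = max(grid, key=lambda z: z * z * weight(z))

# Proportion placed on the lower dose level mu - z0 * sigma.
lam = 0.5 * (1.0 + z0 / 2.33)

optimal = [(z0, 1.0 - lam), (-z0, lam)]
equal_split = [(z0, 0.5), (-z0, 0.5)]
print(round(z0, 3), round(lam, 3))
print(variance_of_target(optimal), variance_of_target(equal_split))
```

Under these assumptions the search returns a level close to 1.575 and a proportion close to 0.84 on the lower dose, and the resulting design has a smaller asymptotic variance than an equal split between the same two levels.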

A curious property of this solution is that its implementation
requires knowing $\mu$ and $\sigma$. Were they known, it would be
unnecessary to gather the data. It is sometimes convenient
to think of the design problem in terms of two statisticians.
One, the designer, knows $\mu$ and $\sigma$ but can only communicate the
design. The second, the analyzer, must analyze the data from
the experiment, using techniques such as maximum-likelihood
estimation which do not presuppose that the design itself
tells what the value of $\theta$ is. Usually, a more useful point
of view is simply that if we have an approximate idea of the
unknown parameters we can design an experiment which is
approximately optimal. This dependence of the optimal design
on the unknown value of $\theta$ is expressed by calling the design
locally optimal. We shall return to this point in the next
section on sequential estimation.

Suppose that the experiment e had cost c. From a decision
theoretic point of view it is natural to think of a loss
$k(\hat\theta - \theta)^2$ for the estimate $\hat\theta$ when $\theta$ is correct.
The total expected cost for using e, n times, would be

$$cn + kE(\hat\theta - \theta)^2 \approx cn + k I^{11}/n$$

which is minimized for $n = (k I^{11}/c)^{1/2}$, giving a total expected
cost of $2(k I^{11} c)^{1/2}$. The calculus determining optimal sample
size is different from that of the testing problem of Section
2 mainly because in that case we deal with error probabilities
which decrease exponentially in n, while in estimation we deal
with losses due to error which are of the order of magnitude
of $n^{-1}$.
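The estimation sample-size formula can be checked in a few lines. This is a sketch only; the values of c, k and $I^{11}$ below are hypothetical.

```python
import math

def expected_cost(n, c, k, v):
    """Sampling cost c*n plus expected loss k*v/n, with v playing the role of I^{11}."""
    return c * n + k * v / n

def optimal_n(c, k, v):
    """Minimizer (k*v/c)**0.5 of expected_cost over n."""
    return math.sqrt(k * v / c)

def optimal_cost(c, k, v):
    """Minimized expected cost 2*(k*v*c)**0.5."""
    return 2.0 * math.sqrt(k * v * c)

c, k, v = 0.01, 1.0, 8.93  # hypothetical cost, loss constant, and variance term
n_star = optimal_n(c, k, v)
print(n_star, expected_cost(n_star, c, k, v), optimal_cost(c, k, v))
```

Here the minimized cost shrinks like $c^{1/2}$ as the cost per observation decreases, in contrast with the $-c \log c$ behavior found for the testing problem.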

If different elementary experiments have different costs, 
the above theory still applies. All that is required is to 
consider information per unit cost. Indeed if the cost it- 
self is random, one merely divides information by expected 
cost. 

Suppose that the object of the estimation problem is to
estimate two independent functions of the unknown vector $\theta$.
How should one measure the value of an estimator? From a
decision theoretic point of view it seems natural to consider
a loss function $\ell(\hat\theta,\theta)$ for the estimate $\hat\theta$ when $\theta$ is
the true value. Assuming a locally parabolic loss function which
attains a minimum value when $\hat\theta = \theta$, one obtains

$$\ell(\hat\theta,\theta) = c_0(\theta) + \sum_{i,j} c_{ij}(\theta)(\hat\theta_i - \theta_i)(\hat\theta_j - \theta_j) + o(|\hat\theta - \theta|^2)$$

where $C = \|c_{ij}(\theta)\|$ is non-negative definite. Then it becomes
reasonable to require that one minimize

$$E\left\{ \sum_{i,j} c_{ij}(\theta)(\hat\theta_i - \theta_i)(\hat\theta_j - \theta_j) \right\} = \sum_{i,j} c_{ij} \, \sigma_{\hat\theta_i \hat\theta_j}$$

where $\sigma_{XY} = E\{[X - E(X)][Y - E(Y)]\}$ is the covariance of X and Y.
Thus we minimize

(15)   $\operatorname{tr}(C\Sigma) = \operatorname{tr}(C I^{-1})$

where I is the information matrix. To be concerned with two
functions of $\theta$ corresponds to dealing with a C of rank 2. In
that case, an optimal experiment can be constructed using at
most k + (k-1) of the pure experiments. In general if one is
interested in r < k functions of the k parameter vector $\theta$, a
transformation of parameters reduces the problem to that of
estimating the first r components of $\theta$. The problem of
minimizing any "monotone" function of the upper left hand r x r
submatrix of $I^{-1}$ has a solution which requires the use of at
most k+(k-1)+...+(k-r+1) pure experiments.

A final remark concerns the possibility of coming across
singular information matrices. The above theory still applies
with a minor modification where the inverse of the information
matrix is replaced by a suitable version of a pseudo-inverse
which makes sense as a representative of the asymptotic
covariance matrix.


Sequential Experimentation for Estimation 


Suppose that it is desired to estimate the mean of a normal
distribution with variance 1. The cost of error is $k(\hat\theta - \theta)^2$
and the cost of an observation is c. The expected cost associated
with using the mean of a sample of n observations is
$cn + kn^{-1}$ and is minimized for $n = n_0 = (k/c)^{1/2}$. There is no
advantage to be gained by using a sequential stopping rule.

If the variance were unknown then there would be some point 
in using a sequential procedure. For asymptotically optimal 
results, it is clear that the sequential aspect of the problem 
is relatively minor. As data are accumulated, one uses them 
to estimate the variance and to determine an appropriate sample 
size. In small cost of sampling problems, the appropriate 
sample size is well known long before the end of sampling and 
an occasional updating of the estimate of the variance will
suffice to obtain high efficiency. Indeed one can easily con- 
struct two-sample procedures of relative efficiency one. 

When there is a choice of experiment to be made after each
observation, the situation does not change substantially. If
one applies the locally optimal design assuming that the current
estimate is the correct value of $\theta$, it is clear that one
will find an asymptotically optimal sequential procedure.

Thus, in the main part, the sequential experimentation problem
for estimation is no different from that of finding
locally optimal designs.







It is possible to achieve a higher order of efficiency in
estimation problems. Then the loss incurred due to ignorance
amounts to a quantity comparable to the cost of $(\log n_0)$
observations rather than simply $o(n_0)$ observations. To do so
involves consideration of the one-armed bandit problem [9]
and we shall not concern ourselves with this refinement which
seems out of place in the simulation context.


Sequential Experiments for Testing 


The fixed sample optimal design for estimation can be simply
adapted to the sequential design problem because of its locally
optimal nature. The solution of the testing problem which
we described in the second section of this paper does not have
this locally optimal character and does not easily convert to an
optimal sequential design.

We proposed a design which would be equally good under both
hypotheses, granted that we could not change experiments in
midstream. Suppose that we were reasonably sure that $H_1$ was
correct but not sure enough to warrant cessation of sampling.
What would be a good experiment to perform? The answer to this
question would have the local character which is desirable in
a good sequential experimentation scheme.

An examination of Wald's sequential probability-ratio test
from a Bayesian point of view suggests that if $H_1$: $f(x) = f_1(x)$
is correct, a desirable experiment is one which makes
the posterior probability of $H_2$ decrease most rapidly. If $w_{in}$
is the posterior probability of $H_i$ based on the first n
observations, then

(16)   $\frac{w_{2n}}{w_{1n}} = \frac{w_2}{w_1} e^{-S_n}$

where

(17)   $S_n = \sum_{i=1}^{n} \log[f_1(X_i)/f_2(X_i)].$

The law of large numbers tells us that when $H_1$ is true,

$$S_n/n \to E_{H_1} \log[f_1(X)/f_2(X)] = \int \log[f_1(x)/f_2(x)] f_1(x)\,dx.$$

Thus $w_{2n}/w_{1n}$ approaches zero exponentially at a rate
determined by

(18)   $I(f_1,f_2) = \int \log[f_1(x)/f_2(x)] f_1(x)\,dx$

which is a Kullback-Leibler information number [15]. This
suggests that if there is a choice of experiments, one should
be selected to maximize $I(f_1,f_2)$ if $H_1$ is believed to be true.
It should be selected to maximize $I(f_2,f_1)$ if $H_2$ is believed
to be true.

Consider extending the testing problem to the case of
composite hypotheses. Let us first test $H_1$: $f(x) = f_1(x)$ vs.
$H_2$: $f(x) = f_2(x)$ or $f(x) = f_3(x)$. Then if $f_1, f_2, f_3$ have
prior probabilities $w_1, w_2, w_3$, the posterior probabilities
$w_{1n}, w_{2n}, w_{3n}$ based on n observations are given by

$$\frac{w_{2n}}{w_{1n}} = \frac{w_2}{w_1} e^{-S_{12n}}$$

$$\frac{w_{3n}}{w_{1n}} = \frac{w_3}{w_1} e^{-S_{13n}}$$

where

$$S_{12n} = \sum_{i=1}^{n} \log[f_1(X_i)/f_2(X_i)]$$

and

$$S_{13n} = \sum_{i=1}^{n} \log[f_1(X_i)/f_3(X_i)].$$

If $H_1$ is true $S_{12n}/n \to I(f_1,f_2)$ and $S_{13n}/n \to I(f_1,f_3)$.
Consequently, the posterior probability of $H_2$, $w_{2n} + w_{3n}$,
approaches zero exponentially at a rate determined by the smaller
of the two numbers $I(f_1,f_2)$, $I(f_1,f_3)$. This suggests that in the
case of experimental choice when $H_1$ is believed to be true one
should select e to maximize $\min[I(f_1,f_2), I(f_1,f_3)]$.

We now present a more general sequential testing problem
and an asymptotically optimal procedure. It is desired to
test $H_1$: $\theta \in \omega_1$ vs. $H_2$: $\theta \in \omega_2$, where
$\omega_1 \cap \omega_2 = \emptyset$. There is available a class $\mathcal{E}$ of
elementary experiments e, with outcome $X_e$ which has probability
density $f(x,\theta,e)$. The cost of deciding wrong is $r(\theta)$. The
cost per observation is c. After each experiment a terminal
decision can be made or a new experiment selected to be performed
independently of the past data.

Let

(19)   $I(\theta,\phi,e) = \int \log[f(x,\theta,e)/f(x,\phi,e)] f(x,\theta,e)\,dx$

$h(\theta) = \omega_1$, $a(\theta) = \omega_2$ if $\theta \in \omega_1$

$h(\theta) = \omega_2$, $a(\theta) = \omega_1$ if $\theta \in \omega_2$.

Let $e(\theta)$ be the randomized experiment which maximizes

(20)   $\inf_{\phi \in a(\theta)} I(\theta,\phi,e)$

and let $I(\theta)$ be the maximized value. Let $e^{(n)}$ be the nth
experiment used, $X_n$ the outcome, $\hat\theta_n$ the maximum-likelihood
estimate of $\theta$ based on the outcome of the first n experiments.

Let $\tilde\theta_n$ be the maximum-likelihood estimate when $\theta$ is
restricted to $a(\hat\theta_n)$ (the alternative hypothesis of $\hat\theta_n$). Then
the generalized likelihood-ratio $\lambda_n$ is given by

(21)   $\log \lambda_n = \sum_{i=1}^{n} \log[f(X_i,\hat\theta_n,e^{(i)})/f(X_i,\tilde\theta_n,e^{(i)})].$

Procedure A consists of stopping experimentation and accepting
$h(\hat\theta_n)$ after the nth observation if $\log \lambda_n \geq -\log c$.
Otherwise $e^{(n+1)} = e(\hat\theta_n)$ is selected to maximize

$$\inf_{\phi \in a(\hat\theta_n)} I(\hat\theta_n,\phi,e).$$

The sample size at the end of experimentation is labeled N.

It can be shown [7] that if the true value of the parameter
is $\theta$, the expected cost

(22)   $R(\theta) = cE_\theta(N) + r(\theta)\epsilon(\theta) \approx \frac{-c \log c}{I(\theta)}$

where $\epsilon(\theta)$ is the error probability and is of the order of
magnitude of c. Moreover, this method is asymptotically
optimal in the sense that for any procedure where $R(\theta) =
O(-c \log c)$ for all $\theta$, then

$$\liminf_{c \to 0} \frac{R(\theta)}{-c \log c} \geq 1/I(\theta).$$


The potential gain to be derived from using randomized
experiments can be clearly seen by reference to the following
situation, keeping in mind that if e is a randomized mixture
of $e_1$ and $e_2$ with probabilities p and 1-p (we write
$e = pe_1 \oplus (1-p)e_2$) then
$I(\theta,\phi,e) = pI(\theta,\phi,e_1) + (1-p)I(\theta,\phi,e_2)$.
Suppose $H_1$: $\theta = \theta_1$ is tested vs. $H_2$: $\theta = \theta_2$ or
$\theta = \theta_3$, and $\theta_1$ is true, and $I(\theta_1,\phi,e)$ is given by the
following table.

Table 2
$I(\theta_1,\phi,e)$

The pure strategy which maximizes $\min_\phi I(\theta_1,\phi,e)$ is $e_2$ for
which the minimum is 2. If the randomized experiment combining
$e_1$ and $e_3$ each with probability 1/2 is permitted, we
achieve a value of 3.
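The gain from randomization can be illustrated with a small computation. The entries of Table 2 did not survive in this copy, so the table used below is hypothetical: it was chosen only to be consistent with the two statements in the text (best pure experiment with minimum 2, equal mixture of two other experiments with value 3). The labels and numbers are assumptions, not the original data.

```python
# Hypothetical values of I(theta_1, phi, e) for phi in {theta_2, theta_3}.
INFO = {
    "e1": {"theta2": 6.0, "theta3": 0.0},
    "e2": {"theta2": 2.0, "theta3": 2.0},
    "e3": {"theta2": 0.0, "theta3": 6.0},
}
ALTERNATIVES = ["theta2", "theta3"]

def worst_case_information(mixture):
    """Minimum over alternatives of the mixture's information.

    mixture maps experiment names to probabilities; the information of a
    randomized experiment is the probability-weighted average.
    """
    return min(
        sum(p * INFO[e][alt] for e, p in mixture.items())
        for alt in ALTERNATIVES
    )

best_pure = max(INFO, key=lambda e: worst_case_information({e: 1.0}))
pure_value = worst_case_information({best_pure: 1.0})
mixed_value = worst_case_information({"e1": 0.5, "e3": 0.5})
print(best_pure, pure_value, mixed_value)
```

With these assumed entries the best pure experiment guards against both alternatives only weakly, while the mixture of two specialized experiments does better against the less favorable alternative.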


Extensions and Related Results in Estimation 


The quoted results on locally optimal designs for estimation
and optimal sequential designs for testing hypotheses have been
generalized or are closely related to other results. In this
section we shall briefly describe a few of those relevant to
estimation.

Elfving [10] introduced the following regression model.
Suppose that one may observe

(23)   $Y_i = \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_k x_{ik} + u_i, \quad i = 1, \ldots, n$

where the $u_i$ are normally distributed residuals with mean 0
and common variance $\sigma^2$, and the vectors
$x^{(i)} = (x_{i1}, x_{i2}, \ldots, x_{ik})$ are restricted to a compact set S.
It is desired to estimate

(24)   $\theta = a_1\beta_1 + a_2\beta_2 + \cdots + a_k\beta_k$

by a linear unbiased estimate based on the n observations,

(25)   $\hat\theta = \sum_{i=1}^{n} \ell_i Y_i.$

How should one allocate the n vectors $x^{(i)}$? The solution takes
the following elegant geometric form. Consider S* the convex
set generated by the points x of S and their negatives -x.
Extend a ray from the origin through the vector
$a = (a_1, a_2, \ldots, a_k)$. The point u where the ray penetrates S*
represents the solution. If u is an element x of S or its
negative, an optimal design consists of selecting x n times.
Otherwise u is a convex combination of at most k points of the
generating set, $u = \sum_{i=1}^{k} c_i x^{(i)}$ where
$\sum_{i=1}^{k} |c_i| = 1$. Then an optimal design consists of the
experiments $x^{(1)}, x^{(2)}, \ldots, x^{(k)}$ in proportions
$|c_1|, |c_2|, \ldots, |c_k|$. Moreover, the variance of the estimate
is given by

(26)   $\sigma_{\hat\theta}^2 = \frac{\sigma^2}{n} (a/u)^2$

where a/u is the scalar ratio of the two vectors.

Suppose now that one is restricted to use only experiments
$x^{(1)}, x^{(2)}, \ldots, x^{(r)}$. Then replacing S by this set of r points,
the above results also show how well we can do in this
suboptimal situation.







The information matrix of the $\beta$'s resulting from the use
of x (assuming that the irrelevant parameter $\sigma$ is known) is
easily computed to be

(27)   $I(x) = \|x_i x_j\|.$


Thus since the results of the third section of this paper de- 
pend only on information matrices, any problem, whether it is 
a regression problem or not, whose elementary experiments have 
information matrices of rank 1 can be solved by using Elfving's 
solution. Indeed, this is how the solution of our second ex- 
ample was derived. Elementary experiments have information 
matrices of rank 1, if the distribution of the outcome depends 
on a single function of the unknown parameters. 

Elfving's results were generalized and applied by Kiefer
and Wolfowitz [13,14] who observed that in the multiparametric
case the criterion which minimizes the generalized variance of
the estimates, i.e., minimizes

$$V = \det \Sigma$$

also provides that design for which the maximum variance of the
estimated regression is minimized. More precisely, for $x \in S$,

$$\hat{Y}(x) = \hat\beta_1 x_1 + \hat\beta_2 x_2 + \cdots + \hat\beta_k x_k$$

where the $\hat\beta_i$ are estimates based on the designed experiment.
The variance of $\hat{Y}$ depends on x and the maximum value of this
variance for $x \in S$ is minimized by the design which minimizes
the generalized variance.

If it is desired to extrapolate, i.e., to estimate the
regression for points $x \notin S$, this result no longer applies and the
meaningfulness of the generalized variance criterion is open
to serious question.


Relevance to Simulation 


The theories presented up to now are highly parametric in
that their application requires knowledge of the probability
distribution of the data as a function of the unknown
parameters. It is the nature of most simulation and Monte Carlo
problems that this knowledge is not available. However, some
relevance still exists.

Perhaps the main useful idea derived from the theory is that 
one can do better by combining several experiments rather than 
concentrating on one as is common in simulation. Motivated by 
some ideas in neutron transport problems where several quanti- 
ties are estimated (reflection, absorption, transmission), I 
constructed the following simple example for analysis. 

Consider a game in which a coin is tossed. If it falls
Heads (H), the player wins 1 and the game ends. If it falls
Tails (T) it is tossed once more and TH leads to termination
with no payment. The result TT leads to a final toss with
payment of 4 for TTT and a loss of 8 for TTH. The probability
of a head on a single toss is p = .3. It is desired to compute
the value or expected gain of the game.

We assume that an experimenter is incapable of deriving or
computing the value of the game,
$\theta = p + 4(1-p)^3 - 8(1-p)^2 p = .496$.
He can simulate the game. Alternatively he can simulate a
modified game where p = .3 is replaced by x, according to the
principles of importance sampling. Suppose he applies level x
and studies the two variables gain and number of tosses per
game. Thus he observes two variables $Y_1 = Y_1(x)$ and $Y_2 = Y_2(x)$
with distributions described in Table 3. The coefficients
$a_1$, $a_2$, $a_3$, and $a_4$ represent the weights used to translate the
results of the modified game to be relevant to the original
game.

Table 3
Simulation at level x

Outcome        H       TH        TTT        TTH
Probability    x       (1-x)x    (1-x)^3    (1-x)^2 x
Corresponding values of
Y_1(x)         a_1     0         4a_3       -8a_4
Y_2(x)         a_1     2a_2      3a_3       3a_4

$a_1 = \frac{p}{x}, \quad a_2 = \frac{(1-p)p}{(1-x)x}, \quad a_3 = \frac{(1-p)^3}{(1-x)^3}, \quad a_4 = \frac{(1-p)^2 p}{(1-x)^2 x}$

It is easy to see that

$E(Y_1) = \theta = p + 4(1-p)^3 - 8(1-p)^2 p$

$E(Y_2) = \eta = p + 2(1-p)p + 3(1-p)^2 = 3 - 3p + p^2.$

The covariance matrix $\Sigma(x) = \|\sigma_{ij}(x)\|$ of $Y = (Y_1, Y_2)$ is
directly computable as a function of x.

Assuming that the cost of the simulation is proportional to
the average number of tosses per trial at level x, $3 - 3x + x^2$,
the appropriate measure of variance of $\hat\theta$ adjusted for cost is

$$V(x) = [3 - 3x + x^2] \, \sigma_{11}(x).$$

For the unadjusted game V = V(p) = V(.3) = 32.74. The best
single level x gives V = V(.295) = 32.73. If two levels
$x_1$ and $x_2$ are used so that the proportion of total cost
allocated to $x_1$ is $\lambda$ we have

$$V^* = [\lambda I_1 + (1-\lambda) I_2]^{11}$$

where

$$I_i = [3 - 3x_i + x_i^2]^{-1} [\Sigma(x_i)]^{-1}$$

is analogous to the information matrix per unit toss for level
$x_i$. We select $\lambda$, $x_1$ and $x_2$ to minimize V* and obtain
$\lambda = .83$, $x_1 = .40$, $x_2 = .17$ for a value of V* = 25.31 using
information matrices

[information matrices $I_1$ and $I_2$; the printed entries are not legible in this copy]

The appropriate estimate of $\theta$ will have the form

$$\alpha Y_1(x_1) + (1-\alpha) Y_1(x_2) + \beta [Y_2(x_1) - Y_2(x_2)]$$

where $\Sigma(x_1)$ and $\Sigma(x_2)$ can be used to determine appropriate
values of $\alpha$ and $\beta$.
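The figures quoted for this example can be recomputed directly from the outcome probabilities and weights of the game simulated at level x. The sketch below assumes p = .3, a cost-adjusted variance equal to the expected number of tosses times the variance of $Y_1$, and, for two levels, the 1-1 element of the inverse of the cost-weighted mixture of the per-toss information matrices. The function names are illustrative assumptions.

```python
P = 0.3  # probability of a head in the original game

def outcomes(x, p=P):
    """(probability, Y1, Y2) for outcomes H, TH, TTT, TTH simulated at level x."""
    a1 = p / x
    a2 = (1 - p) * p / ((1 - x) * x)
    a3 = (1 - p) ** 3 / (1 - x) ** 3
    a4 = (1 - p) ** 2 * p / ((1 - x) ** 2 * x)
    return [
        (x, a1, a1),
        ((1 - x) * x, 0.0, 2 * a2),
        ((1 - x) ** 3, 4 * a3, 3 * a3),
        ((1 - x) ** 2 * x, -8 * a4, 3 * a4),
    ]

def covariance(x):
    """Return (s11, s12, s22), the covariance matrix of (Y1, Y2) at level x."""
    rows = outcomes(x)
    m1 = sum(q * y1 for q, y1, _ in rows)
    m2 = sum(q * y2 for q, _, y2 in rows)
    s11 = sum(q * (y1 - m1) ** 2 for q, y1, _ in rows)
    s12 = sum(q * (y1 - m1) * (y2 - m2) for q, y1, y2 in rows)
    s22 = sum(q * (y2 - m2) ** 2 for q, _, y2 in rows)
    return s11, s12, s22

def cost(x):
    """Expected number of tosses per game at level x."""
    return 3 - 3 * x + x * x

def v_single(x):
    """Cost-adjusted variance when a single level x is used."""
    return cost(x) * covariance(x)[0]

def info_per_toss(x):
    """Elements (i11, i12, i22) of [cost(x) * Sigma(x)]^{-1}."""
    s11, s12, s22 = covariance(x)
    scale = cost(x) * (s11 * s22 - s12 * s12)
    return s22 / scale, -s12 / scale, s11 / scale

def v_two_levels(lam, x1, x2):
    """1-1 element of the inverse of lam*I_1 + (1-lam)*I_2."""
    a, b = info_per_toss(x1), info_per_toss(x2)
    m11 = lam * a[0] + (1 - lam) * b[0]
    m12 = lam * a[1] + (1 - lam) * b[1]
    m22 = lam * a[2] + (1 - lam) * b[2]
    return m22 / (m11 * m22 - m12 * m12)

print(v_single(0.3))
print(v_two_levels(0.83, 0.40, 0.17))
```

Under these assumptions the single-level value at x = .3 agrees with the quoted 32.74, and the two-level allocation gives a value close to the quoted 25.31.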
Admittedly the efficiency of the optimal procedure is not
terribly striking but the example was constructed under stringent
time and simplicity restrictions and it does indicate that
combining two experiments can raise efficiency by some factor.




How would one expect to do the analysis required to optimize 
in more complicated problems? It should be pointed out that 
a relatively small number of trials may be used to obtain 
estimates of the covariance matrices and the information 
matrices for a particular experiment. Thus, what we did here 
analytically could be estimated from the first stage of a large 
simulation. 

The usefulness of two experimental levels in our third
example lies partly in the possibility of selecting levels x for
which $Y_1$ and $Y_2$ are highly positively correlated and other
levels for which they are highly negatively correlated. If
the quantities being estimated correspond to the probabilities
of non-overlapping events, one tends to have only negative
correlations and there is little scope for the use of several
levels.

The potential range of applications of sequential testing 
theory seems rather small. There have been classical problems 
in science where one desired to estimate a quantity which was 
known to be an integer. For such problems the testing theory 
may be appropriate. Here again, the idea of using several
experiments seems potentially valuable.


Other Approaches 


Experimental design had two major developments. One con- 
cerned sample surveys where ideas such as stratified sampling 
and cluster sampling were developed. The other involving the 
use of latin squares, factorial experiments, partially balanced 
designs, etc., was mainly concerned with computational feasi- 
bility and attempts to avoid the consequences of hidden effects 
which sometimes are not even explicitly indicated in the model. 
Efficiency considerations appeared but in a somewhat secondary 
role. In both of these developments randomization plays an 
important role, mainly as a means of avoiding the effects of 


hidden regularities. 







Sequential experimentation arose in several contexts. The 
Response Surface design approach seems to be a valuable one for 
simulation applications and is considered in detail in some of 
the other papers at this symposium. 

The Robbins-Monro stochastic approximation method, its ex- 
tensions and the Up and Down method are worthy of special 
note. 

The sequential design theory for testing hypotheses which 
we described has been extended to k action problems by Bessler 
[1,2] and further generalized by Kiefer and Sacks [12]. Box 
and Hill developed an alternative technique [3], which Meeter, 
Pirie and Blot [16] pointed out was suboptimal asymptotically 
but seems to do better than the Chernoff-Bessler method for 
simulation study examples involving moderate sample sizes. 
Meeter, Pirie and Blot proposed and studied some alternative 
procedures. I wish to propose the following modification of 
the Box-Hill and Chernoff-Bessler methods conjectured to be 
asymptotically optimal and effective for moderate sample sizes. 
If experimentation is to continue after the nth observation 
select the next experiment e to maximize 


we 


] 
i nts aco 3? 


1(8; 85 ,e)/ 


y We 
6jea(6;) Ie 
where $w_{in}$ is the posterior probability of $\theta_i$ and $a(\theta_i)$ is the
set of all possible states of nature for which the best action
is different from that appropriate for $\theta_i$.

Some problems in design for estimation have been attacked
where the results of successive experiments are not assumed to
be independent. A problem which seems to be important in the
context of classification problems also falls in this category.
Here we may observe a multivariate random variable X which is
useful to discriminate between two populations. However, it
is costly to observe any component of X. Which component should
we look at first? Having seen that, which component shall we
look at next? When should we stop and make a terminal decision?
Here we lose the elements of (i) independence of successive
trials and (ii) the ability to repeat an experiment. The
nonrepeatability aspects of this problem have been treated by
Elfving [11].




Bibliography 


1. Bessler, S. "Theory and Application of the Sequential
Design of Experiments, k-Actions and Infinitely Many
Experiments," Part I - Theory, Applied Mathematics and
Statistics Laboratories, Stanford University Technical
Report No. 55, (1960).

2. Bessler, S. "Theory and Applications of the Sequential
Design of Experiments, k-Actions and Infinitely Many
Experiments," Part II - Applications, Applied Mathematics
and Statistics Laboratories, Stanford University
Technical Report No. 56, (1960).

3. Box, G.E.P., and Hill, W.J. "Discrimination Among
Mechanistic Models," Technometrics, IX (1967), 57-71.

4. Chernoff, Herman. "A Measure of Asymptotic Efficiency for
Tests of a Hypothesis Based on the Sum of Observations,"
Annals of Mathematical Statistics, XXIII (1952), 493-507.

5. Chernoff, Herman. "Locally Optimal Designs for Estimating
Parameters," Annals of Mathematical Statistics, XXIV
(1953), 586-602.

6. Chernoff, Herman. "Large Sample Theory, Parametric Case,"
Annals of Mathematical Statistics, XXVII (1956), 1-22.

7. Chernoff, Herman. "Sequential Design of Experiments,"
Annals of Mathematical Statistics, XXX (1959), 755-770.

8. Chernoff, Herman. "Optimal Design of Experiments,"
Proceedings of the Eighth Conference on the Design of
Experiments in Army Research Development and Testing, ARO-D
Report 63-2, (1963), [pages illegible].

9. Chernoff, Herman. "Optimal Stochastic Control," Sankhya,
(to be published).

10. Elfving, G. "Optimal Allocation in Linear Regression
Theory," Annals of Mathematical Statistics, XXIII (1952),
255-262.

11. Elfving, G. "Selection of Nonrepeatable Observations for
Estimation," Proceedings of the Third Berkeley Symposium,
I (1956), 69-75.

12. Kiefer, J., and Sacks, J. "Asymptotically Optimal
Sequential Inference and Design," Annals of Mathematical
Statistics, XXXIV (1963), 705-750.

13. Kiefer, J., and Wolfowitz, J. "Optimal Designs in
Regression Problems," Annals of Mathematical Statistics, XXX
(1959), 271-294.

14. Kiefer, J., and Wolfowitz, J. "The Equivalence of Two
Extremum Problems," Canadian Journal of Mathematics, XII
(1960), 363-366.

15. Kullback, S., and Leibler, R.A. "On Information and
Sufficiency," Annals of Mathematical Statistics, XXII (1951),
79-86.

16. Meeter, D., Pirie, W., and Blot, W. "A Comparison of Two
Model-Discrimination Criteria," Florida State University
Report, Tallahassee, Florida, (1968).


Data Analysis 







Regression Analysis and 
Analysis of Variance 


Harry Smith, University of North Carolina 


Introduction 


The tendency to consider applied statistics as a subject 
made up of many distinct and separate techniques has led to 
a great deal of misunderstanding. One of the foremost ex- 
amples of this is the distinction made between the Analysis 
of Variance and Regression Analysis in most textbooks. It is 
true that the impetus behind the creation of ANOVA came from 
the design of experiments in which the elements were treat- 
ments and blocks, both of which were considered qualitative 
type factors. In recent years, the development of response 
surface methodology has rekindled the understanding that the 
two subjects are merely both part of the topic, "Generalized
Linear Models." In fact, it has been shown that the ANOVA
subject matter is just a specific subset of regression anal- 
ysis. 

The purpose of this paper is to consider the relationship
between ANOVA and regression analysis for both qualitative
and quantitative factors, to indicate how to put the usual
ANOVA material into a form for analysis by regression methods,
and to point out some of the difficulties that arise in doing this.


General Regression Analysis Results 


The following general regression results are well-known,
and are presented here so that the analysis of variance results
can be shown in this context:

(1)   $Y = X\beta + \epsilon$

where Y is an (n x 1) vector of observations
      X is an (n x r) design matrix of full rank r
      $\beta$ is an (r x 1) vector of unknown parameters
      $\epsilon$ is an (n x 1) vector of errors.

Under the assumption $E(\epsilon) = 0$, the least squares estimate
of $\beta$ is

(2)   $b = (X'X)^{-1} X'Y.$

If one desires to test hypotheses of the form $H_0$: $C\beta = \gamma$,
where C is a (c x r) matrix and $\gamma$ is a (c x 1) vector, and C has
rank c, the following additional assumptions are required:
$E(\epsilon\epsilon') = I\sigma^2$ and $\epsilon \sim N(0, I\sigma^2)$.

In order to test the hypothesis indicated by the C matrix,
the sum of squares due to this hypothesis is

(3)   $SSH_0 = (Cb - \gamma)' [C(X'X)^{-1}C']^{-1} (Cb - \gamma),$

and the corresponding sum of squares due to error is

(4)   $SSE = Y'Y - Y'X(X'X)^{-1}X'Y.$

Thus, the hypothesis $H_0$: $C\beta = \gamma$ versus the alternative
$H_1$: $C\beta \neq \gamma$ is tested using the F-statistic

(5)   $F(c, n-r) = \frac{SSH_0/c}{SSE/(n-r)}.$

A more general result is needed for experimental designs.
Given the model:

$Y = X\beta + \epsilon$, with Y (n x 1), X (n x r), $\beta$ (r x 1), $\epsilon$ (n x 1),

where the rank of X = s < r < n. Thus, the matrix X is not of
full rank. By use of the transformation

(6)   $\beta = G\Gamma$, with $\beta$ (r x 1), G (r x s), $\Gamma$ (s x 1),

where G is an (r x s) matrix which consists of the restrictions
on the regression coefficients in the model, the model is reduced
to one of full rank s.


Regression Analysis and Analysis of Variance 125 





The least squares estimates of the $\Gamma$'s are written

(7) $\hat{\Gamma} = (G'X'XG)^{-1}G'X'Y.$

The sum of squares due to the hypothesis $H_0: C\beta = \gamma$ is
equivalent to the $SSH_0$ for $H_0: CG\Gamma = \gamma$ and is

(8) $SSH_0 = (CG\hat{\Gamma} - \gamma)'[CG(G'X'XG)^{-1}G'C']^{-1}(CG\hat{\Gamma} - \gamma).$

Correspondingly, for this case,

(9) $SSE = Y'Y - Y'XG(G'X'XG)^{-1}G'X'Y.$

Further, if more than one hypothesis is to be tested, say

$H_{01}: C_1G\Gamma = \gamma_1$ and $H_{02}: C_2G\Gamma = \gamma_2,$

the tests of the two hypotheses are independent if and only if

(10) $C_1G(G'X'XG)^{-1}G'C_2' = 0.$


An Experimental Design and ANOVA 


A typical experiment compares two catalysts and three
reagents in a catalyst plant. This is a 3 x 2 cross-classified
design where each of the six sets of conditions was replicated
twice. The data are shown as follows:





The mathematical model written in the usual manner is

(11) $Y_{ijk} = \mu + R_i + C_j + (RC)_{ij} + \epsilon_{ijk}$

where $\mu$ = population mean
$R_i$ = effect of the ith reagent (i=1,2,3)
$C_j$ = effect of the jth catalyst (j=1,2)
$(RC)_{ij}$ = interaction effect of the ith reagent and the jth catalyst
$\epsilon_{ijk}$ = random variable $\sim N(0,\sigma^2)$ (k=1,2)

and the following constraints are made:

(12) $\sum_{i=1}^{3} R_i = 0$

(13) $\sum_{j=1}^{2} C_j = 0$

$\sum_{i} (RC)_{ij} = \sum_{j} (RC)_{ij} = 0.$

The Analysis of Variance results are

ANOVA
Source of
Variation     d.f.   S.S.   M.S.    F
Total          11    208
Reagents        2    104     52    13
Catalysts       1     48     48    12
R x C           2     32     16     4
Residual        6     24      4

The estimates of the various effects in the model are as
follows:

(14) [estimates of $\mu$, $R_i$, $C_j$, and $(RC)_{ij}$; the printed values are
illegible here apart from fragments reading "+4", "-2", "+2", "+2"]

(Note: The restrictions imposed on the parameters are also
imposed on the estimates, i.e., $\sum_i \hat{R}_i = \sum_j \hat{C}_j = \sum_i \widehat{(RC)}_{ij} = \sum_j \widehat{(RC)}_{ij} = 0$.)







To test the hypothesis $H_0: R_1 = R_2 = R_3 = 0$ versus the
alternative hypothesis that at least two reagent effects are
different, the F-test is used; namely,

$F(2,6) = \frac{SSH_0/2}{SSE/6} = \frac{52}{4} = 13.00$

Depending on a choice of $\alpha$ = the probability of the Type I
error, one either accepts or rejects the null hypothesis.
Corresponding tests can be made for catalysts and the R x C
interaction.
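
The mean squares and F-ratios in the ANOVA table follow from the sums of squares and degrees of freedom alone. A minimal Python sketch of that arithmetic is given below; the variable names are illustrative choices, and only the numbers from the table are used.

```python
# Degrees of freedom and sums of squares as given in the ANOVA table.
anova = {"Reagents": (2, 104), "Catalysts": (1, 48), "R x C": (2, 32)}
residual_df, residual_ss = 6, 24

ms_error = residual_ss / residual_df                     # residual mean square
mean_squares = {name: ss / df for name, (df, ss) in anova.items()}
f_ratios = {name: ms / ms_error for name, ms in mean_squares.items()}
total_ss = sum(ss for _, ss in anova.values()) + residual_ss
total_df = sum(df for df, _ in anova.values()) + residual_df

for name in anova:
    print(name, mean_squares[name], f_ratios[name])
print("Total", total_df, total_ss)
```

Each ratio is compared with the F distribution having the source's degrees of freedom in the numerator and 6 in the denominator.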


ANOVA Results Using Previous Regression Results


First, model (11) must be rewritten in terms comparable to
the regression model (1). This is accomplished through the
use of dummy variables, $X_j$, which take on only the values of
0 or 1, in the following way. The model for the above design
is

(15) $Y_i = \beta_0 X_{0i} + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{3i} + \beta_4 X_{4i} + \beta_5 X_{5i}$
$\quad + \beta_{14} X_{1i}X_{4i} + \beta_{15} X_{1i}X_{5i} + \beta_{24} X_{2i}X_{4i} + \beta_{25} X_{2i}X_{5i}$
$\quad + \beta_{34} X_{3i}X_{4i} + \beta_{35} X_{3i}X_{5i} + \epsilon_i$

where $X_{0i}$ = 1 for all i
$X_{1i}$ = 1 if reagent #1 is used, 0 otherwise
$X_{2i}$ = 1 if reagent #2 is used, 0 otherwise
$X_{3i}$ = 1 if reagent #3 is used, 0 otherwise
$X_{4i}$ = 1 if catalyst #1 is used, 0 otherwise
$X_{5i}$ = 1 if catalyst #2 is used, 0 otherwise

and $\beta_0$ = true intercept (the population mean in this case)
$\beta_1$ = effect of Reagent #1
$\beta_2$ = effect of Reagent #2
$\beta_3$ = effect of Reagent #3
$\beta_4$ = effect of Catalyst #1
$\beta_5$ = effect of Catalyst #2
$\beta_{14}$ = interaction effect $(RC)_{11}$
$\beta_{15}$ = interaction effect $(RC)_{12}$
$\beta_{24}$ = interaction effect $(RC)_{21}$
$\beta_{25}$ = interaction effect $(RC)_{22}$
$\beta_{34}$ = interaction effect $(RC)_{31}$
$\beta_{35}$ = interaction effect $(RC)_{32}$

Thus, using these definitions, the model, $E(Y) = X\beta$, is
shown below. Each row of $X$ corresponds to one observation
$Y_{ijk}$ and each column to one element of
$\beta = (\beta_0, \beta_1, \beta_2, \beta_3, \beta_4, \beta_5, \beta_{14}, \beta_{15}, \beta_{24}, \beta_{25}, \beta_{34}, \beta_{35})'$:

          b0 b1 b2 b3 b4 b5 b14 b15 b24 b25 b34 b35
E(Y111)    1  1  0  0  1  0   1   0   0   0   0   0
E(Y112)    1  1  0  0  1  0   1   0   0   0   0   0
E(Y121)    1  1  0  0  0  1   0   1   0   0   0   0
E(Y122)    1  1  0  0  0  1   0   1   0   0   0   0
E(Y211)    1  0  1  0  1  0   0   0   1   0   0   0
E(Y212)    1  0  1  0  1  0   0   0   1   0   0   0
E(Y221)    1  0  1  0  0  1   0   0   0   1   0   0
E(Y222)    1  0  1  0  0  1   0   0   0   1   0   0
E(Y311)    1  0  0  1  1  0   0   0   0   0   1   0
E(Y312)    1  0  0  1  1  0   0   0   0   0   1   0
E(Y321)    1  0  0  1  0  1   0   0   0   0   0   1
E(Y322)    1  0  0  1  0  1   0   0   0   0   0   1


Inspection of the above X matrix reveals that it is of rank
s = 6 and not of rank r = 12. Thus, the $\beta$'s in the parameter
vector need to be constrained. This can be done many ways,
but the transformation indicated by (6) will be used to
illustrate how the restraints required in the ANOVA calculations
can be utilized to define $G$ and $\Gamma$. In this case,
$\beta = G\Gamma$ with $\Gamma = (\beta_0, \beta_1, \beta_2, \beta_4, \beta_{14}, \beta_{24})'$ and

          b0  b1  b2  b4 b14 b24
b0         1   0   0   0   0   0
b1         0   1   0   0   0   0
b2         0   0   1   0   0   0
b3         0  -1  -1   0   0   0
b4         0   0   0   1   0   0
b5         0   0   0  -1   0   0
b14        0   0   0   0   1   0
b15        0   0   0   0  -1   0
b24        0   0   0   0   0   1
b25        0   0   0   0   0  -1
b34        0   0   0   0  -1  -1
b35        0   0   0   0   1   1

where the constraints are as follows:

$\beta_3 = -\beta_1 - \beta_2$
$\beta_5 = -\beta_4$
$\beta_{15} = -\beta_{14}$
$\beta_{25} = -\beta_{24}$
$\beta_{34} = -\beta_{14} - \beta_{24}$
$\beta_{35} = \beta_{14} + \beta_{24}$

These constraints govern the pattern of the G matrix. Utilizing
this transformation and the formulas (7, 8, 9), the analysis
of variance results and treatment effects (14) can be obtained.
Thus,

(16) $\hat{\Gamma} = (\hat{\beta}_0, \hat{\beta}_1, \hat{\beta}_2, \hat{\beta}_4, \hat{\beta}_{14}, \hat{\beta}_{24})'$ [numerical values illegible in the printed source]


Test of Hypothesis Example 


If one were to test the hypothesis that Reagent effects
were zero, or

$H_0: R_1 = R_2 = R_3 = 0$
$H_1$: at least 2 are different,

the C matrix could be defined as follows:

C (2 x 12) =
  0 1 0 0 0 0 0 0 0 0 0 0
  0 0 1 0 0 0 0 0 0 0 0 0

Using this matrix and the result (8) the sum of squares for
this hypothesis can be obtained. It will be the same as
indicated in the ANOVA table.
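
The route from the incidence matrix to the reagent sum of squares can be followed numerically. The sketch below is an illustration in Python under stated assumptions, not the author's computation: the twelve responses are hypothetical values constructed only so that their sums of squares agree with the ANOVA table, and the helper functions and variable names are invented for this example. It builds X from the dummy-variable definitions, builds G from the sum-to-zero constraints, and applies formulas (7), (8), and (9).

```python
from fractions import Fraction

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def solve(A, B):
    """Return A^{-1} B by Gauss-Jordan elimination (A square, non-singular)."""
    n = len(A)
    M = [[Fraction(v) for v in A[i]] + [Fraction(v) for v in B[i]]
         for i in range(n)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        head = M[col][col]
        M[col] = [v / head for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [row[n:] for row in M]

# Incidence matrix X: intercept, 3 reagent columns, 2 catalyst columns,
# 6 interaction columns; two replicates per (reagent, catalyst) cell.
cells = [(i, j) for i in (1, 2, 3) for j in (1, 2)]
X = []
for (i, j) in cells:
    row = ([1] + [int(i == a) for a in (1, 2, 3)]
           + [int(j == c) for c in (1, 2)]
           + [int((i, j) == cell) for cell in cells])
    X += [row, list(row)]

# G encodes the sum-to-zero constraints; Gamma = (b0, b1, b2, b4, b14, b24)'.
G = [[1, 0, 0, 0, 0, 0], [0, 1, 0, 0, 0, 0], [0, 0, 1, 0, 0, 0],
     [0, -1, -1, 0, 0, 0], [0, 0, 0, 1, 0, 0], [0, 0, 0, -1, 0, 0],
     [0, 0, 0, 0, 1, 0], [0, 0, 0, 0, -1, 0], [0, 0, 0, 0, 0, 1],
     [0, 0, 0, 0, 0, -1], [0, 0, 0, 0, -1, -1], [0, 0, 0, 0, 1, 1]]

# HYPOTHETICAL responses, ordered Y111, Y112, Y121, ..., Y322.
Y = [[y] for y in (13, 9, 7, 7, 11, 11, 5, 1, 16, 12, 14, 14)]

XG = matmul(X, G)
A = matmul(transpose(XG), XG)                      # G'X'XG, non-singular
gamma_hat = solve(A, matmul(transpose(XG), Y))     # formula (7)

C_mat = [[0, 1] + [0] * 10, [0, 0, 1] + [0] * 9]   # H0: beta1 = beta2 = 0
CG = matmul(C_mat, G)
est = matmul(CG, gamma_hat)                        # CG Gamma-hat, gamma = 0
M = matmul(CG, solve(A, transpose(CG)))
ssh = matmul(transpose(est), solve(M, est))[0][0]  # formula (8)
sse = (matmul(transpose(Y), Y)[0][0]
       - matmul(matmul(transpose(Y), XG), gamma_hat)[0][0])  # formula (9)
F = (ssh / 2) / (sse / 6)
print([g[0] for g in gamma_hat], ssh, sse, F)
```

For these made-up responses the hypothesis and error sums of squares agree with the Reagents and Residual lines of the ANOVA table, which is the equivalence the text describes.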


General Comments 


The advent of computer programs for generalized linear 
models has provided the impetus for analyzing experiments by 
the approach shown above. However, it is not at all clear 
that this is an optimum procedure for all experimental designs. 
The generalized approach has the following disadvantages: 


1. The model has many more terms, one for each degree of 
freedom for estimating an unknown parameter. Thus, the size 
of the linear model becomes very large even for seemingly small 
experiments. 

2. The experiment has to be indicated through an incidence 
or X matrix. While this is not difficult, it is necessary to
exercise extreme care in writing it. 

3. Usually, all incidence matrices are singular and require 
reparameterization either directly by known constraints, or by 
means of the transformation $G\Gamma$ as shown in this paper.

4. The analysis through the generalized regression model 
requires an excellent, workable computer routine. While this 
may not seem a deterrent, it can be a real stumbling block for 
the generalized approach. 


However, the generalized approach has some definite advantages:


1. The same methodology is useful for all types of experi- 
mental data: qualitative, quantitative, balanced, unbalanced, 
etc. 

2. The extension to models in which the assumption of common
variance is not valid, i.e., $E(\epsilon\epsilon') = V\sigma^2$ where V is a known,
symmetric non-singular matrix, is straight-forward. For example,
the estimates are obtained by using

$\hat{\Gamma} = (G'X'V^{-1}XG)^{-1}G'X'V^{-1}Y.$
3. The extension to multivariate analysis, i.e., when the 

vector Y is replaced by a matrix Y (n x p), is relatively 

straight-forward. This is not meant to imply that it is an 


easy-to-understand analytical method. 
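
The weighted estimate mentioned in the second advantage can be illustrated on a very small case. The following Python sketch is an assumption-laden illustration: `Z` stands for the product XG, the three observations and the diagonal V are hypothetical, and the helper names are invented here. It evaluates $(Z'V^{-1}Z)^{-1}Z'V^{-1}Y$ with exact rational arithmetic.

```python
from fractions import Fraction

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def solve(A, B):
    """Return A^{-1} B by Gauss-Jordan elimination (A square, non-singular)."""
    n = len(A)
    M = [[Fraction(v) for v in A[i]] + [Fraction(v) for v in B[i]]
         for i in range(n)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        head = M[col][col]
        M[col] = [v / head for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [row[n:] for row in M]

def weighted_estimate(Z, V, Y):
    """Gamma-hat = (Z'V^{-1}Z)^{-1} Z'V^{-1}Y, where Z plays the role of XG."""
    Zt = transpose(Z)
    Vinv_Z = solve(V, Z)     # V^{-1} Z without forming V^{-1} explicitly
    Vinv_Y = solve(V, Y)
    return solve(matmul(Zt, Vinv_Z), matmul(Zt, Vinv_Y))

# Hypothetical case: a single mean, third observation with 4 times the variance.
Z = [[1], [1], [1]]
Y = [[2], [4], [10]]
V = [[1, 0, 0], [0, 1, 0], [0, 0, 4]]
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

gamma_weighted = weighted_estimate(Z, V, Y)[0][0]
gamma_ordinary = weighted_estimate(Z, I3, Y)[0][0]   # V = I recovers (7)
print(gamma_weighted, gamma_ordinary)
```

With V equal to the identity the estimate reduces to the ordinary least squares form (7), which is why the same routine serves both cases.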


Conclusions 


Specialized analysis of variance programs should be utilized
for the smaller balanced problem; i.e., when the number
of parameters, p < 16, and the number of observations, n < 30.

For the larger balanced problem and for the unbalanced problem,
the generalized linear model approach is optimal. The
computer programs for the generalized linear model are excellent
and well-documented [1].


Bibliography 


1. Grizzle, J., and Starmer, F. "A Computer Program For Analysis
of Data by General Linear Models," Institute of Statistics
Mimeo Series, University of North Carolina, Chapel
Hill, N.C.


2. Rao, C.R. Linear Statistical Inference and Its Applications.
New York: John Wiley & Sons, 1965.


Selection and Ranking 
Procedures 


S. S. Gupta and S. Panchapakesan, 
Purdue University 


Introduction and Summary 





Often in practice one encounters k populations (categories, 
varieties, processes, candidates, etc.) and associated with 
the ith population, $\pi_i$, is an observable random variable whose
distribution depends upon an unknown parameter $\theta_i$, i = 1, 2, ...,
k. The classical tests of homogeneity, i.e., testing the
hypothesis of equality of parameters, did not answer the ques- 
tion of what next, if the hypothesis was rejected. The re- 
sult of the attempts of the practical decision-maker to form- 
ulate the problem in a more realistic and meaningful way is 
the selection and ranking procedures, otherwise known as 
multiple decision problems. Bahadur [3] was one of the earl- 
iest authors to contribute to the theory of k sample problems. 
Many authors have since contributed to various aspects and 
modifications of the basic problem. References could be made 
to Bechhofer [8], Gupta [16], Gupta and Sobel [23], Lehmann 
[35], and Gupta [18]. Naylor, Wertz and Wonnacott [36] have 
discussed the role of multiple ranking and comparison tech- 
niques in the analysis of data from computer simulation exper- 
iments. 

Generally, problems of selection and ranking have been form- 
ulated in the following two types: (i) selecting a fixed num- 
ber t of "best" populations using an indifference zone ap- 
proach which is due to Bechhofer [8], (ii) selecting a subset 
of random size of the k populations, a formulation due to 
Gupta [16]. Gupta [18] discusses subset selection procedures 
for location and scale parameters and also makes a brief
review of work by other authors in the area of selection and
ranking problems and other related problems. In the present
paper, our attention is confined to subsequent developments
in the problems of selection and ranking using the subset
selection formulation.

The general nature of investigations in these problems is
to assume k populations with distribution functions $F(x, \theta_i)$,
i = 1, 2, ..., k, where $\theta_i$ is a vector of population parameters.
The populations are ranked in terms of scalar functions
$\psi_i = \psi(\theta_i)$, i = 1, 2, ..., k; $\theta_i$ may be completely unknown or
partly unknown; the functional form of $\psi(\theta_i)$, however, is
known. The ordered $\psi_i$ are denoted by $\psi_{[1]} \le \psi_{[2]} \le \ldots \le \psi_{[k]}$.
Of course, the correct pairing of the unordered and ordered
$\psi$'s is not known. To be precise, let us suppose that the
larger the value of $\psi$, the better the population.

In the indifference zone approach, we wish to select the
t (t < k) best populations, i.e., the populations with the t largest
$\psi$ values. We define a procedure which will select exactly t
populations so as to guarantee with probability P* that the
selected populations are the t best ones whenever the distance
d (to be suitably defined in terms of $\psi$'s) between the
set of t best populations and that of the remaining k-t
populations either exceeds or equals a specified amount $\delta$. In
our set-up, d is the distance between the (k-t+1)th best and
the (k-t)th best populations. So in this formulation $\delta$ (> 0)
and P* have to be specified in advance. Of course, $P^* > 1/\binom{k}{t}$
for any meaningful problem because, otherwise, we can always
select t of the k populations at random.

In the subset selection formulation, we wish to define a
procedure which will select a subset, whose size is a random
variable taking on values t through k. The procedure should
be such that the subset size is small but large enough to
guarantee a minimum probability P* of including in the selected
subset the t best populations regardless of the configuration
of the population parameters. In this formulation, only
P* ($1/\binom{k}{t} < P^* < 1$) is specified in advance.


Let CS stand for a 'correct selection', which means selection
of the t best populations or selection of a subset
which includes the t best populations depending upon the type
of formulation. Let $\Omega$ be the space of $\psi = (\psi_1, \psi_2, \ldots, \psi_k)$ and
$\Omega_\delta$ be the subset of $\Omega$ where $d \ge \delta$. Then in the indifference
zone approach we want to define a procedure R such that

(1) $\inf_{\Omega_\delta} P\{CS|R\} \ge P^*$

and in the subset selection formulation, we define R such that

(2) $\inf_{\Omega} P\{CS|R\} \ge P^*.$


In all these investigations, after a rule R has been proposed, 
more or less heuristically, its properties are studied. Re- 
cently, Sobel [39] has discussed a formulation which combines 
the indifference zone approach and the subset selection
approach. His goal is to select a subset of at least size s
so as to include any one of the t best populations with a minimum
probability P* under an indifference zone set-up. A
usual modification of the goal in all these problems is to 
consider selection of populations better than a standard or 
control population (see Gupta and Sobel [22], Dunnett [13]). 
In the following section we describe a class of selection 
procedures and discuss certain properties of these procedures. 
The succeeding sections summarize some specific procedures for 
various selection problems mainly considered by Barlow and 
Gupta [4], Gupta and Nagel [20], Gupta and Studden [26], Gupta 
and Panchapakesan [21], Gnanadesikan and Gupta [15] and Barron 
[7]. Brief tables are given for the constants in some of the 


procedures. 


A Class of Selection and Ranking Procedures 


Let $\pi_i$ be the population with an observable random variable
$X_i$ having density function $f_{\lambda_i}(x)$, i = 1, 2, ..., k.
Let $\lambda_{[1]} \le \lambda_{[2]} \le \ldots \le \lambda_{[k]}$ be the ordered $\lambda_i$'s. Let
$x' = (x_1, x_2, \ldots, x_k)$ be an observation on $X' = (X_1, X_2, \ldots, X_k)$.
Based on the observation vector x, we are interested in selecting
a non-empty subset of the k populations such that the
probability is at least P* that the best population, i.e.,
the one associated with $\lambda_{[k]}$, is included in the subset. Let
$h_b(x)$, $b \in [0, \infty)$ (or $b \in [1, \infty)$) be a class of functions such that
for every x on the common support of $F_\lambda$,

(a) $h_b(x) \ge x$
(b) $h_0(x) = x$ (or $h_1(x) = x$)
(c) $\lim_{b \to \infty} h_b(x) = \infty$
(d) $h_b(x)$ is continuous and monotone-increasing in b.

Then the class C of procedures $R_h$ is defined as follows:

$R_h$: select $\pi_i$ iff

(3) $h_b(x_i) \ge \max_{1 \le r \le k} x_r.$


The above procedure selects a non-empty subset of random size
in view of (a). Denoting by $X_{(i)}$ the random variable associated
with $\lambda_{[i]}$, we have
$P\{CS|R_h\} = P\{h_b(X_{(k)}) \ge X_{(r)},\ r = 1, 2, \ldots, k-1\}$

(4) $= \int \{\prod_{r=1}^{k-1} F_{\lambda_{[r]}}(h_b(x))\} f_{\lambda_{[k]}}(x)\,dx.$

If we now assume that $F_\lambda(x) \le F_{\lambda'}(x)$ for $\lambda > \lambda'$ and for
all x, then

(5) $\inf_{\Omega} P\{CS|R_h\} = \inf_{\lambda} \int F_\lambda^{k-1}(h_b(x)) f_\lambda(x)\,dx,$

where $\Omega$ is the space of $\lambda = (\lambda_1, \lambda_2, \ldots, \lambda_k)$.
Now we discuss the infimum over $\lambda$ of
$\int F_\lambda^{k-1}(h_b(x)) f_\lambda(x)\,dx = A(\lambda)$, say. We are interested in the
cases where the density $f_\lambda(x)$ is of any one of the following forms:

(i) $f_\lambda(x) = f(x - \lambda)$, i.e., $\lambda$ is a location parameter.
(ii) $f_\lambda(x) = \frac{1}{\lambda} f(x/\lambda)$, $x \ge 0$, $\lambda > 0$, i.e., $\lambda$ is a scale parameter.
(iii) $f_\lambda(x) = \sum_{j=0}^{\infty} W(\lambda, j) g_j(x)$, where $g_j(x)$, j = 0, 1, ...,
is a sequence of density functions on $[0, \infty)$ and $W(\lambda, j)$ are
non-negative weight functions such that $\sum_{j=0}^{\infty} W(\lambda, j) = 1$.

By a lemma of Gupta [19], it can be seen easily that in cases
(i) and (ii), where we can take $h_b(x) = x + b$, $b > 0$ and $h_b(x) = bx$,
$b > 1$ respectively, the infimum of $A(\lambda)$ takes place for $\lambda = 0$
and 1 respectively.
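
For the location case the rule and its infimum probability of a correct selection can be evaluated numerically. The Python sketch below is an illustration under explicit assumptions: the populations are taken to be normal with unit variance (the text does not fix a distribution), population indices are 0-based, and the function names, integration range, and step count are arbitrary choices. It implements rule (3) with $h_b(x) = x + b$ and evaluates the integral in (5) at $\lambda = 0$ by the trapezoidal rule.

```python
import math

def select_subset(x, b):
    """Rule (3) with h_b(x) = x + b: keep population i iff x_i + b >= max x."""
    top = max(x)
    return [i for i, xi in enumerate(x) if xi + b >= top]

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def pcs_equal_means(b, k, lo=-10.0, hi=10.0, steps=4000):
    """A(0) = integral of Phi(x + b)^(k-1) phi(x) dx (assumed N(0,1) case)."""
    width = (hi - lo) / steps
    total = 0.0
    for s in range(steps + 1):
        x = lo + s * width
        weight = 0.5 if s in (0, steps) else 1.0
        total += weight * normal_cdf(x + b) ** (k - 1) * normal_pdf(x)
    return total * width

def smallest_b(p_star, k, tol=1e-6):
    """Bisection for the smallest b with A(0) >= P* (A is increasing in b)."""
    lo_b, hi_b = 0.0, 20.0
    while hi_b - lo_b > tol:
        mid = 0.5 * (lo_b + hi_b)
        if pcs_equal_means(mid, k) >= p_star:
            hi_b = mid
        else:
            lo_b = mid
    return hi_b

print(select_subset([1.2, 0.1, 0.9], b=0.5))
print(pcs_equal_means(b=1.0, k=2))
print(smallest_b(p_star=0.90, k=2))
```

Two checks are available in closed form under these assumptions: with b = 0 the integral equals 1/k, and with k = 2 it equals $\Phi(b/\sqrt{2})$.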

We have some important distributions which come under case
(iii), namely, the non-central $\chi^2$, the non-central F and the
distribution of the multiple correlation coefficient. In
these cases, $W(\lambda, j)$ is either a Poisson weight or a negative
binomial one.

We have assumed in the course of our discussion that $F_\lambda(x)$
is stochastically increasing. In case (iii), $f_\lambda(x)$ is totally
positive of order 2 ($TP_2$) and hence $F_\lambda(x)$ is stochastically
increasing if $W(\lambda, j)$ and $g_j(x)$ are $TP_2$. This result follows
from using the basic composition formula of Polya and Szego
(see Karlin [29, p. 17] and Gupta and Panchapakesan [21]).

As regards the infimum of $A(\lambda)$ over $\lambda$, Gupta and Studden
[26] have obtained a sufficient condition for $A(\lambda)$ to be
monotonically increasing (non-decreasing) in $\lambda$, when
$W(\lambda, j) = e^{-\lambda}\lambda^j/j!$, $\lambda > 0$. Gupta and Panchapakesan [21] have
obtained a similar condition when
$W(\lambda, j) = \frac{\Gamma(q+m+j)}{\Gamma(q+m)\,j!}\lambda^j(1-\lambda)^{q+m}$.
The two conditions are also sufficient for
$B(\lambda) = \int [1 - F_\lambda(x/b)]^{k-1} f_\lambda(x)\,dx$ to be monotonically
increasing (non-decreasing) in $\lambda$, this integral being the one for
which we need the infimum over $\lambda$ for the problem of selecting a
subset including the population associated with $\lambda_{[1]}$. We will
list below for convenience the distributions which belong to case (iii)
and which have been discussed by Gupta and Studden [26] and
Gupta and Panchapakesan [21]. The sufficient conditions for
$A(\lambda)$ and $B(\lambda)$ to be monotonically increasing have been verified
by Gupta and Studden [26] in the case of the first two densities
and by Gupta and Panchapakesan [21] for the last two
densities.


Densities $g_j(x)$, weights $W(\lambda, j)$, and remarks:

1. $g_j(x) = \frac{e^{-x}x^{u+j-1}}{\Gamma(u+j)}$; $W(\lambda, j) = \frac{e^{-\lambda}\lambda^j}{j!}$.
   $x \ge 0$, $u > 0$, $\lambda > 0$; non-central $\chi^2$ with non-centrality
   parameter $2\lambda$ and degrees of freedom $2u$.

2. $g_j(x) = \frac{\Gamma(u+v+j)}{\Gamma(v)\Gamma(u+j)}\,\frac{x^{u+j-1}}{(1+x)^{u+v+j}}$; $W(\lambda, j) = \frac{e^{-\lambda}\lambda^j}{j!}$.
   $x \ge 0$, $u > 0$, $v > 0$; $g_j(x)$ is the density of a constant
   (depending on j) times a central F with d.f. 2u+2j and 2v.

3. $g_j(x) = \frac{\Gamma(q+j+m)}{\Gamma(q+j)\Gamma(m)}\,x^{q+j-1}(1-x)^{m-1}$; $W(\lambda, j) = \frac{e^{-\lambda}\lambda^j}{j!}$.
   $0 \le x \le 1$, $\lambda > 0$, $q, m > 0$; $f_\lambda(x)$ is the density of $R^2$
   (square of the multiple correlation coefficient) in the
   'conditional' case.

4. $g_j(x) = \frac{\Gamma(q+j+m)}{\Gamma(q+j)\Gamma(m)}\,x^{q+j-1}(1-x)^{m-1}$; $W(\lambda, j) = \frac{\Gamma(q+m+j)}{\Gamma(q+m)\,j!}\lambda^j(1-\lambda)^{q+m}$.
   $0 \le x \le 1$, $0 \le \lambda < 1$, $q, m > 0$; $f_\lambda(x)$ is the density of $R^2$
   in the 'unconditional' case.

Having proposed the procedure $R_h$, we wish to study its
properties. One of the desirable properties of the procedure
is what is known as the unbiasedness or monotonicity property,
which means that the larger the $\lambda$ value of a population the
greater the probability of its being included in the selected
subset. Gupta [19] has shown that $R_h$ has this property.

In all these procedures we are interested in a suitable
criterion for the performance of a procedure. One that naturally
suggests itself is E(S), the expected size of the subset
selected. It is also meaningful to consider $E(S_1)$, where $S_1$ is
the number of populations included in the subset other than
the best. Hence we are interested in sup E(S) or sup $E(S_1)$
over all possible configurations of $\lambda_{[1]}, \ldots, \lambda_{[k]}$.
If we denote by $P_i$ the probability that the population
associated with $\lambda_{[i]}$ is selected, then

(6) $E(S_1) = P_1 + P_2 + \ldots + P_{k-1}$, and

(7) $E(S) = E(S_1) + P_k.$

For the procedure $R_h$ defined in the beginning of this section

(8) $P_i = \int \{\prod_{j=1, j \ne i}^{k} F_{\lambda_{[j]}}(h_b(x))\} f_{\lambda_{[i]}}(x)\,dx$, i = 1, 2, ..., k.

It follows from Lehmann [34, p. 112, ex. 11] that $P_i$ increases
in $\lambda_{[i]}$. Because of the assumption that $F_\lambda$ is
stochastically increasing in $\lambda$, by differentiating $P_i$ with
respect to any $\lambda_{[j]}$, $j \ne i$, we can easily see that $P_i$ decreases
in each $\lambda_{[j]}$, $j \ne i$. Hence each $P_i$, i = 1, ..., k-1, decreases in
$\lambda_{[k]}$, which means $E(S_1)$ decreases in $\lambda_{[k]}$, keeping the other $\lambda$'s
fixed.


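Now, let us consider

(9) $P_i = \int \{\prod_{j=1, j \ne i}^{k} F_{\lambda_{[j]}}(h_b(y))\} f_{\lambda_{[i]}}(y)\,dy.$

Using integration by parts, we can write

(10) $P_i = [\{\prod_{j \ne i} F_{\lambda_{[j]}}(h_b(y))\} F_{\lambda_{[i]}}(y)]^*$
$\quad - \sum_{r=1, r \ne i}^{k} \int \{\prod_{j \ne i, r} F_{\lambda_{[j]}}(h_b(y))\} F_{\lambda_{[i]}}(y) f_{\lambda_{[r]}}(h_b(y))\, h_b'(y)\,dy,$

where $h_b'(y)$ is the derivative of $h_b(y)$ w.r.t. y and the first
term with the asterisk stands for the value obtained by evaluating
it between proper limits. We note that the first term,
when evaluated, will be independent of $\lambda_{[1]}$. Hence

(11) $E(S)$ = a term independent of $\lambda_{[1]}$
$\quad + \sum_{i=2}^{k} \int \{\prod_{j \ne 1, i} F_{\lambda_{[j]}}(h_b(y))\} \{F_{\lambda_{[1]}}(h_b(y)) f_{\lambda_{[i]}}(y) - h_b'(y) F_{\lambda_{[1]}}(y) f_{\lambda_{[i]}}(h_b(y))\}\,dy.$

Differentiating w.r.t. $\lambda_{[1]}$, we find that for E(S) to be
non-decreasing in $\lambda_{[1]}$, it is sufficient that

(12) $\frac{\partial}{\partial \lambda_{[1]}} F_{\lambda_{[1]}}(h_b(y))\, f_{\lambda_{[i]}}(y) - h_b'(y)\, \frac{\partial}{\partial \lambda_{[1]}} F_{\lambda_{[1]}}(y)\, f_{\lambda_{[i]}}(h_b(y)) \ge 0$
for i = 2, ..., k.

The condition (12) is satisfied if the density $f_\lambda(y)$ satisfies
the condition that, for $\lambda_1 \le \lambda_2$,

(13) $\frac{\partial}{\partial \lambda_1} F_{\lambda_1}(h_b(y))\, f_{\lambda_2}(y) - h_b'(y)\, \frac{\partial}{\partial \lambda_1} F_{\lambda_1}(y)\, f_{\lambda_2}(h_b(y)) \ge 0.$

For the location parameter case discussed, $h_b(y) = y + b$,
$b > 0$ and $F_\lambda(y) = F(y - \lambda)$. Hence, in this case, (13) is the
same as the condition that for $y_1 \le y_2$ and $\lambda_1 \le \lambda_2$,

(14) $f(y_1 - \lambda_1) f(y_2 - \lambda_2) - f(y_2 - \lambda_1) f(y_1 - \lambda_2) \ge 0,$

which is the condition for the density $f_\lambda(y) = f(y - \lambda)$ to have
a monotone likelihood ratio.

For the scale parameter case, we have $h_b(y) = by$, $b > 1$
and $F_\lambda(y) = F(y/\lambda)$. In this case (13) is equivalent to the
condition that for $y_1 \le y_2$ and $\lambda_1 \le \lambda_2$,

(15) $f_{\lambda_1}(y_1) f_{\lambda_2}(y_2) - f_{\lambda_1}(y_2) f_{\lambda_2}(y_1) \ge 0,$

which is the condition for the density $f_\lambda(y)$ to have a monotone
likelihood ratio.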


We can summarize the above in the following lemma. 


Lemma 1. 


For the procedure $R_h$, E(S) is non-decreasing in $\lambda_{[1]}$, keeping
$\lambda_{[2]}, \ldots, \lambda_{[k]}$ fixed, provided that the density $f_\lambda(y)$
satisfies the condition (13), which, in the cases of location and
scale parameters, with $h_b(y) = b + y$ (b > 0) and $h_b(y) = by$ (b > 1)
respectively, is equivalent to $f_\lambda(y)$ having a monotone
likelihood ratio.


Lemma 2. 


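For the procedure $R_h$, $E_{\lambda(m)}(S)$ is non-decreasing in $\lambda$ provided
that E(S) is non-decreasing in $\lambda_{[1]}$, where $\lambda(m)$ denotes
any vector $\lambda$ such that $\lambda_{[1]} = \lambda_{[2]} = \ldots = \lambda_{[m]} = \lambda \le \lambda_{[m+1]} \le \ldots \le \lambda_{[k]}$,
$1 \le m < k$, and $E_{\lambda(m)}(S) = E(S \mid \lambda = \lambda(m))$.


Proof. It can be easily seen that

(16) $\frac{\partial}{\partial \lambda} E_{\lambda(m)}(S) = \sum_{i=1}^{m} \frac{\partial}{\partial \lambda_{[i]}} E(S) \Big|_{\lambda = \lambda(m)}.$

We also note that E(S) as a mathematical function in $\lambda_{[1]}, \ldots,$
$\lambda_{[k]}$ is symmetric in all the $\lambda$'s. So the right hand side of
(16) is $m\,(\partial E(S)/\partial \lambda_{[1]}) |_{\lambda = \lambda(m)}$. Hence, if E(S) is
non-decreasing in $\lambda_{[1]}$, $(\partial E(S)/\partial \lambda_{[1]}) \ge 0$ and consequently
$(\partial/\partial \lambda) E_{\lambda(m)}(S) \ge 0$. This completes the proof of Lemma 2.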


Theorem l. 


For the procedure $R_h$, E(S) attains its supremum at one of
the points of $\Omega$ where $\lambda$ has all components equal provided that
the density $f_\lambda(y)$ satisfies the condition (13), which, in the
cases of location and scale parameters, with $h_b(y) = b + y$ (b > 0)
and $h_b(y) = by$ (b > 1) respectively, is equivalent to $f_\lambda(y)$
having a monotone likelihood ratio.


Proof. The proof is straightforward by application of 
Lemma 1 and successive applications of Lemma 2. 
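
The behaviour described by Theorem 1 can be examined numerically in the location case. The Python sketch below is an illustration only, assuming unit-variance normal populations with means $\lambda_j$ and $h_b(x) = x + b$; the function names and integration settings are arbitrary choices. It evaluates each inclusion probability $P_i$ as the integral over x of the product of $F_{\lambda_j}(h_b(x))$, $j \ne i$, times $f_{\lambda_i}(x)$, and sums them to obtain E(S).

```python
import math

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def inclusion_probability(i, lam, b, half_width=10.0, steps=4000):
    """P_i for assumed N(lam_j, 1) populations and h_b(x) = x + b."""
    lo = lam[i] - half_width
    width = 2.0 * half_width / steps
    total = 0.0
    for s in range(steps + 1):
        x = lo + s * width
        term = normal_pdf(x - lam[i])
        for j, lam_j in enumerate(lam):
            if j != i:
                term *= normal_cdf(x + b - lam_j)
        total += (0.5 if s in (0, steps) else 1.0) * term
    return total * width

def expected_subset_size(lam, b):
    """E(S) as the sum of the inclusion probabilities P_1, ..., P_k."""
    return sum(inclusion_probability(i, lam, b) for i in range(len(lam)))

size_equal = expected_subset_size([0.0, 0.0, 0.0], b=1.0)
size_spread = expected_subset_size([0.0, 1.0, 3.0], b=1.0)
print(size_equal, size_spread)
```

In this assumed setting the equal-means configuration gives the larger expected subset size, in line with the theorem, and setting b = 0 returns E(S) = 1 because exactly one population attains the maximum.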


Remark 1. 


Since $E(S_1) = E(S) - P_k$ and $P_k$ is non-increasing in $\lambda_{[1]}$,
by Lemma 1, $E(S_1)$ is non-decreasing in $\lambda_{[1]}$ provided that (13)
is satisfied. Further, using arguments similar to the ones
used in Lemma 2, we can show that $E_{\lambda(m)}(S_1)$ is non-decreasing
in $\lambda$ for m = 1, 2, ..., k-1. So we have

(17) $\sup_{\Omega} E(S_1) = \sup_{\lambda_{[1]} = \ldots = \lambda_{[k-1]} \le \lambda_{[k]}} E(S_1).$

Now, using the fact that $E(S_1)$ decreases in $\lambda_{[k]}$, we can see
that the sup $E(S_1)$ is attained at some point of $\Omega$ where $\lambda$ has
all its components equal, provided that (13) holds.


Remark 2. 


If $f_\lambda(y) = \sum_{j=0}^{\infty} W(\lambda, j) g_j(y)$, $\lambda \ge 0$, then using the procedure
$R_h$ with $h_b(y) = by$, $b \ge 1$, the condition (13) is satisfied if,
for $y_1 \le y_2$ and $\lambda_1 \le \lambda_2$,

(18) [expression illegible in the printed source]


Remark 3. 


The problem of selecting a subset containing the population
with the smallest $\lambda$ value can be handled in an analogous
manner.


Let us consider a rule R satisfying the equations

(19) $\inf_{\Omega} P_\lambda(CS|R) = P_{\lambda_0}(CS|R) = P^*$, and

(20) $\sup_{\Omega} E_\lambda(S|R) = E_{\lambda_0}(S|R)$, where

$\lambda_0 = (\lambda_0, \lambda_0, \ldots, \lambda_0)$ is some vector with all components equal.


The equations (19) and (20), together with a simple invari- 

ance condition, have been shown by Gupta and Studden [27] to 

imply that the rule R is minimax in the sense described below. 
Suppose that $x_1, \ldots, x_k$ are a set of observations from the
k populations $\pi_1, \ldots, \pi_k$ and that with this set of observations
we select the ith population with probability $\delta_i(x_1, \ldots, x_k)$.
The invariance or symmetry condition imposed is that if the
ith and jth observations are interchanged, i.e., $x_i$ is observed
from $\pi_j$ and $x_j$ from $\pi_i$, then we select the jth population
with the same probability $\delta_i(x_1, \ldots, x_k)$. More specifically,
we require that

(21) $\delta_i(x_1, \ldots, x_i, \ldots, x_j, \ldots, x_k) = \delta_j(x_1, \ldots, x_j, \ldots, x_i, \ldots, x_k)$

for all i and j. Then the rule R is minimax in the sense that
it minimizes $\sup E_\lambda(S|R')$ over the class of rules R' satisfying
the basic P* condition and the invariance condition (21).
In the location and scale parameter cases equations (19)
and (20) are satisfied since both $P_{\lambda_0}(CS|R)$ and $E_{\lambda_0}(S|R)$ are
independent of $\lambda_0$. It follows that $E(S|R) \le kP^*$ for these
two cases. But in case (iii), where the density $f_\lambda(x)$ is a
mixture of the densities $g_j(x)$, j = 0, 1, 2, ..., if
$F_\lambda(x)$ is stochastically increasing, then

(22) $\inf_{\Omega} P_\lambda(CS|R) = \inf_{\Omega_0} P_\lambda(CS|R),$

where $\Omega_0$ is the subset of $\Omega$ for which $\lambda_1 = \lambda_2 = \ldots = \lambda_k$. However,
in this case, the probability of a correct selection when all
$\lambda_i$ are equal to $\lambda$ is not independent of $\lambda$ as it happens to be
in the other two cases. Elsewhere we have referred to a procedure
for densities falling under this category, for which
$\sup_{\Omega} E_\lambda(S) = k$. In this case the rule is not minimax.


For a discussion of the properties of subset selection procedures
and a comparison of them with the 'approximate' optimal
rule D of Seal [38], the reader is referred to Deely
and Gupta [12].


Selection Procedures for Restricted 
Families of Probability Distributions 


In this and several following sections we discuss some 
specific procedures. As pointed out earlier, they are all 
subset-selection procedures. 

Let $X_i$ associated with $\pi_i$ have a continuous distribution
$F_i$, i = 1, 2, ..., k. We assume that each $F_i$ has a unique $\alpha$-quantile,
$\xi_{\alpha i}$. Let $F_{[i]}(x)$ be the cdf of the population with the ith
smallest $\alpha$-quantile. Assume that

(a) $F_{[i]}(x) \ge F_{[k]}(x)$, i = 1, 2, ..., k and all x, and

(b) there exists a continuous distribution G such that $F_{[i]} \prec G$ for
i = 1, 2, ..., k, where $\prec$ denotes a partial ordering relation on
the space of distributions.

Barlow and Gupta [4] discuss procedures for selecting the population
with the largest (smallest) $\alpha$-quantile for distributions which are
$\prec$-ordered with respect to G.


We take a sample of size n from each of the k populations.
Let $T_{j,i}$ denote the jth order statistic from $F_i$ where $j \le (n+1)\alpha < j+1$.
Then the rule for selecting the population with the
largest $\alpha$-quantile is R: Select population $\pi_i$ iff

(23) $T_{j,i} \ge c \max_{1 \le r \le k} T_{j,r}$

and the rule for selecting the population with the smallest
$\alpha$-quantile is R': Select population $\pi_i$ iff

(24) $d\,T_{j,i} \le \min_{1 \le r \le k} T_{j,r},$

where c = c(k, P*, n, j) (0 < c < 1) and d = d(k, P*, n, j) (0 < d < 1) are
determined so as to satisfy the basic probability requirement
(2).
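
Rule R for the largest $\alpha$-quantile can be written out directly. The following Python sketch is illustrative only: the three samples are hypothetical, the constant c = 0.4 is an arbitrary stand-in rather than a tabulated value, the function names are invented, and population indices are 0-based. It finds j from $j \le (n+1)\alpha < j+1$, takes the jth order statistic of each sample, and keeps the populations with $T_{j,i} \ge c \max_r T_{j,r}$.

```python
import math

def order_statistic_index(n, alpha):
    """Integer j with j <= (n + 1) * alpha < j + 1."""
    return math.floor((n + 1) * alpha)

def select_largest_quantile(samples, alpha, c):
    """Keep population i iff its jth order statistic is >= c * (largest one)."""
    n = len(samples[0])
    j = order_statistic_index(n, alpha)
    if not 1 <= j <= n:
        raise ValueError("no order statistic satisfies j <= (n+1)*alpha < j+1")
    t = [sorted(sample)[j - 1] for sample in samples]   # T_{j,i}, j is 1-based
    threshold = c * max(t)
    selected = [i for i, ti in enumerate(t) if ti >= threshold]
    return selected, t

# Hypothetical positive observations, n = 5 per population; alpha = 0.5 gives j = 3.
samples = [
    [0.3, 1.1, 0.7, 2.0, 0.5],
    [2.4, 3.1, 1.9, 4.0, 2.8],
    [1.0, 1.6, 0.9, 2.2, 1.3],
]
selected, statistics = select_largest_quantile(samples, alpha=0.5, c=0.4)
print(order_statistic_index(5, 0.5), statistics, selected)
```

In practice c would be taken from tables such as those of [5] for the given n, k, j and P*, not chosen freely as here.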

When $G(x) = 1 - e^{-x}$ for $x \ge 0$, the distributions $F_i \prec G$
constitute the class of IFRA distributions studied by Birnbaum,
Esary and Marshall [10]. In this case, the values of
c and d are tabulated by Barlow, Gupta and Panchapakesan [5]
for selected values of n, k, j and P*. The two brief tables
below give the values of c and d for n=5 and j=3.


Table 1. Value of c for Rule R (23) when
$G(x) = 1 - e^{-x}$, $x \ge 0$, n=5, j=3 [5].


2555 a0 ore OF, Ziel 
-42508 . 25464 > 18353 
ZS 22607 . 16388 
- 54243 - 20924 eelSy2 1105 
SIZ SOG .14410 





In the special case where j = 1, it should be observed that 
the values of c and d are independent of n. The distributions 
of the maximum and minimum of ratios of order statistics aris- 
ing in the above selection problem have been discussed by 
Barlow, Gupta and Panchapakesan [5]. 


Table 2. Value of d for Rule R' (24)
when $G(x) = 1 - e^{-x}$, $x \ge 0$,
n=5, j=3 [5].





Rizvi and Sobel [37] propose a distribution-free procedure
$R_1$ for the selection of the largest $\alpha$-quantile which selects
$\pi_i$ iff

(25) $T_{j,i} \ge \max_{1 \le r \le k} T_{j-a,r},$

where a is the smallest integer with $1 \le a \le j-1$ for which
$\inf_{\Omega} P\{CS|R_1\} \ge P^*.$
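
The distribution-free rule compares each population's jth order statistic with the largest (j-a)th order statistic. The Python sketch below illustrates that comparison, taking the rule in the form "select $\pi_i$ iff $T_{j,i} \ge \max_r T_{j-a,r}$"; the samples, the values of j and a, and the function name are hypothetical, and indices of populations are 0-based. Choosing a so that the probability requirement holds is outside the sketch.

```python
def select_rizvi_sobel(samples, j, a):
    """Keep population i iff T_{j,i} >= max over r of T_{j-a,r} (1-based j)."""
    if not 1 <= a <= j - 1:
        raise ValueError("a must satisfy 1 <= a <= j - 1")
    ordered = [sorted(sample) for sample in samples]
    threshold = max(o[j - a - 1] for o in ordered)    # largest (j-a)th statistic
    return [i for i, o in enumerate(ordered) if o[j - 1] >= threshold]

# Hypothetical samples of size n = 5.
samples_a = [
    [0.3, 1.1, 0.7, 2.0, 0.5],
    [2.4, 3.1, 1.9, 4.0, 2.8],
    [1.0, 1.6, 0.9, 2.2, 1.3],
]
samples_b = [[1, 2, 3, 4, 5], [2, 3, 4, 5, 6]]

print(select_rizvi_sobel(samples_a, j=3, a=1))
print(select_rizvi_sobel(samples_b, j=3, a=1))
```

Because the threshold is itself an order statistic of one of the samples, the population supplying it always satisfies the inequality, so the selected subset is never empty.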


Barlow and Gupta [4] have compared R and $R_1$ and established
the asymptotic equivalence of their efficiencies for k=2.
Besides this they also study the relative efficiency of R
with respect to a selection procedure for the gamma populations
proposed by Gupta [17]. The other procedures discussed
by Barlow and Gupta [4] comprise selection procedures for the
median for distributions that are ordered with respect to a
specified G, and procedures for means for distributions that are
ordered with respect to $G(x) = 1 - e^{-x}$, i.e., for the class of IFR
distributions studied by Barlow, Marshall and Proschan [6].


Selection Procedures for 
Multinomial Distribution 


Let $p_1, p_2, \ldots, p_k$ be the unknown cell-probabilities in the
multinomial distribution with $\sum_{i=1}^{k} p_i = 1$. Let $x_1, x_2, \ldots, x_k$ be
the respective observations in the k cells of the distribution
with $\sum_{i=1}^{k} x_i = n$. We are interested in subset selection
procedures for selecting the cell with the largest (smallest)
cell-probability. Gupta and Nagel [20] have discussed this problem.
They propose in the case of the cell with the largest probability
the procedure R which selects the cell with observed
$x_i$ iff

(26) $x_i \ge x_{max} - D$

and in the case of the cell with the smallest probability the
procedure T which selects the cell with observed $x_i$ iff

(27) $x_i \le x_{min} + C,$

where $x_{max} = \max(x_1, x_2, \ldots, x_k)$, $x_{min} = \min(x_1, x_2, \ldots, x_k)$, and
C and D are the smallest non-negative integers for which

(28) $\inf P\{CS|R\} \ge P^*.$
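
The two multinomial rules are simple comparisons of cell counts. The Python sketch below is an illustration assuming the rules take the forms "select cell i iff $x_i \ge x_{max} - D$" and "select cell i iff $x_i \le x_{min} + C$"; the counts, the constants C and D, and the function names are hypothetical, and cell indices are 0-based. It also evaluates P{CS|R} exactly for a given probability vector by enumerating all outcomes, which is practical only for small n and k.

```python
from itertools import product
from math import factorial

def select_largest_cell(counts, D):
    """Procedure R: keep cell i iff x_i >= x_max - D."""
    x_max = max(counts)
    return [i for i, x in enumerate(counts) if x >= x_max - D]

def select_smallest_cell(counts, C):
    """Procedure T: keep cell i iff x_i <= x_min + C."""
    x_min = min(counts)
    return [i for i, x in enumerate(counts) if x <= x_min + C]

def pcs_largest(p, n, D):
    """Exact P{CS|R}; the last cell is assumed to carry the largest p."""
    k = len(p)
    total = 0.0
    for counts in product(range(n + 1), repeat=k):
        if sum(counts) != n:
            continue
        if (k - 1) in select_largest_cell(counts, D):
            coef = factorial(n)
            prob = 1.0
            for x, p_i in zip(counts, p):
                coef //= factorial(x)      # builds the multinomial coefficient
                prob *= p_i ** x
            total += coef * prob
    return total

counts = [7, 3, 5, 5]                      # hypothetical counts, n = 20
print(select_largest_cell(counts, D=2))
print(select_smallest_cell(counts, C=1))
print(pcs_largest([0.5, 0.5], n=2, D=0))
```

Searching over D (or C) for the smallest value whose minimum probability over configurations reaches P* would reproduce the kind of constants tabulated in [20]; that search is not attempted here.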


In the case of R, it is shown that the infimum of the probability
of a correct selection takes place for configurations
of the type (0, 0, ..., 0, q, p, ..., p) where $q \le p$ and the number of
zeros in the above configuration is not known. The case of
the cell with the smallest probability is discussed completely
because the solution cannot be obtained from the other case.
In this case the infimum of the probability of a correct selection
takes place for configurations of the type (p, p, ..., p,
q) and from the numerical evaluations, it appears to take
place when all the cell-probabilities are equal. Tables 3
and 4 below excerpted from Gupta and Nagel [20] give the values
of C and D for some selected values of n and k. The selection
procedures for the multinomial population under the indifference
zone set-up have been discussed by Bechhofer, Elmaghraby
and Morse [9], Kesten and Morse [30] and Cacoullos and Sobel
[11].







Mabie: 3.') Valuel om ic on Rs = na 
(top line) and .90 
(bottom line) [20] 


n 2° 3°54 o 3° %—) 4 9 
0 dl. dd ad 4) Oss 

2 | 2) 2 1.9 9) ap bi Se 
: 2. i i lo eee 

: | 2. -/ Bia he 2 ike ee 
2. De Bi id, a. wl ee 

| 9 3° 2g ee ea 
i 2 2 2 it ww Ja) te 

? | ee ee a eee 
2 3 SG oe ye eee 

a! : 4 4.4 3 & & oe 
3.3 30 & SS Cee 

Me : 5 & & #40. & Bye 


Tabiles4.. Vailluiel of Diston P= — i> 
(top) Line) Vand 790 
(bottom line) [20] 







Selection Procedures for Multivariate
Normal Populations


Although the procedures discussed in this section are for 
multivariate normal populations, the ranking of the populations 
is in terms of a scalar function and the statistic used in any 
procedure is one which has a univariate distribution involv- 
ing that scalar function as a population parameter. So these 
procedures are also useful in the situations where the obser- 
vations come from the respective univariate distributions. 


Let $\pi_1, \pi_2, \ldots, \pi_k$ be k p-variate normal populations with
mean vectors $\mu_i$ and positive definite covariance matrices $\Sigma_i$,
i = 1, 2, ..., k respectively. We discuss procedures for selection
in terms of (1) generalized variance (2) distance function
and (3) multiple correlation coefficient.


Generalized Variance, |Z, | 


Here we assume $\mu_i$ and $\Sigma_i$, i = 1,...,k to be unknown. From a
sample of size n from each population, we have the usual
estimates $S_1, \ldots, S_k$ of $\Sigma_1, \ldots, \Sigma_k$ respectively. Let
$|\Sigma|_{[j]}$ be the jth smallest among $|\Sigma_i|$, i = 1,2,...,k and
$|S|_{(j)}$ be the unknown sample generalized variance associated
with $|\Sigma|_{[j]}$. Then for selecting a subset of the populations
containing the one with the minimum generalized variance,
Gnanadesikan and Gupta [15] propose the procedure
R: Select $\pi_i$ iff

      $|S_i| \le \frac{1}{c}\,|S|_{\min}$

where $|S|_{\min} = \min(|S_1|, \ldots, |S_k|)$ and 0 < c < 1 is determined
so that the basic probability requirement (2) is satisfied.
It has been shown that c is the 100(1-P*) percentage point
of $\eta_{\min}$, where

(30)  $\eta_{\min} = \min\left(\frac{\eta_2}{\eta_1}, \frac{\eta_3}{\eta_1}, \ldots, \frac{\eta_k}{\eta_1}\right)$,

where $\eta_i$, i = 1,2,...,k, are k independent random variables, each
being the product of p independent chi-square variables with
degrees of freedom n-1, n-2, ..., n-p, respectively. The
distribution of $\eta_i$ is known only for p = 2. In this case
$2\eta_i^{1/2}$ is a central $\chi^2$ variable with 2(n-2) d.f. and the
constant c is related to the constant of the procedure of Gupta
and Sobel [24] which has been tabulated for selected values of
n and P* in Gupta and Sobel [25] and in Krishnaiah and Armitage
[32].

When p > 2, an approximation to the distribution of $\eta_i^{1/p}$
suggested by Hoel [28] can be used.
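The generalized-variance rule above can be sketched in a few lines. This is only an illustration under stated assumptions: the helper names (`det`, `sample_cov`, `select_small_gv`), the simulated data and the value c = 0.5 are hypothetical and are not taken from the tables cited in the text; in practice c would be read from those tables for the given k, n, p and P*.

```python
import random

def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n = len(a)
    d = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-300:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            d = -d  # a row swap changes the sign
        d *= a[col][col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= f * a[col][c]
    return d

def sample_cov(xs):
    """Sample covariance matrix (divisor n - 1) of a list of p-vectors."""
    n, p = len(xs), len(xs[0])
    mean = [sum(x[j] for x in xs) / n for j in range(p)]
    return [[sum((x[a] - mean[a]) * (x[b] - mean[b]) for x in xs) / (n - 1)
             for b in range(p)] for a in range(p)]

def select_small_gv(samples, c):
    """Indices i with |S_i| <= |S|_min / c, for a constant 0 < c < 1."""
    gv = [det(sample_cov(xs)) for xs in samples]
    threshold = min(gv) / c
    return [i for i, g in enumerate(gv) if g <= threshold], gv

# Hypothetical data: k = 3 bivariate populations, n = 30 observations each.
rng = random.Random(1)
scales = [1.0, 1.0, 3.0]
samples = [[[rng.gauss(0, s), rng.gauss(0, s)] for _ in range(30)]
           for s in scales]
selected, gv = select_small_gv(samples, c=0.5)  # c = 0.5 is illustrative only
```

The population with the smallest sample generalized variance is always retained; a smaller c enlarges the selected subset.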

Now we consider a partition of the p variables into two
sets of $q_1$ and $q_2$ components respectively, $q_1 + q_2 = p$. The
corresponding partition of $\Sigma_i$ is denoted by

      $\Sigma_i = \begin{pmatrix} \Sigma_{11}^{(i)} & \Sigma_{12}^{(i)} \\ \Sigma_{21}^{(i)} & \Sigma_{22}^{(i)} \end{pmatrix}$, i = 1,2,...,k.

$\Sigma_{11}^{(i)}$, $\Sigma_{22}^{(i)}$, i = 1,2,...,k, are all assumed to be positive
definite. We are interested in selecting a subset containing the
population with the smallest

      $|\Sigma_i| / |\Sigma_{11}^{(i)}| = |\Sigma_{22}^{(i)} - \Sigma_{21}^{(i)} (\Sigma_{11}^{(i)})^{-1} \Sigma_{12}^{(i)}| = \sigma_i$, say.

In other words, if we consider for each population the
conditional distribution of the $q_2$ set when the $q_1$ set is fixed,
then our criterion of ranking is the conditional generalized
variance, which provides a justification for the choice of the
criterion. If the observations are taken on the variables of
the $q_2$ set, holding the variables of the $q_1$ set fixed, then the
problem reduces to the problem of selection in terms of
generalized variance discussed above with p replaced by $q_2$.
If we cannot take observations on the $q_2$ set holding the $q_1$
set fixed, and our criterion is still the conditional
generalized variance, we can have a procedure based on
observations taken on all the variables, all being random.
This could be called the 'unconditional' case. For this case,
the procedure R is discussed below.

Let $S_i$ be the sample covariance matrix from $\pi_i$ based on n
observations, i = 1,2,...,k. Let the partition of $S_i$ be
denoted by

      $S_i = \begin{pmatrix} S_{11}^{(i)} & S_{12}^{(i)} \\ S_{21}^{(i)} & S_{22}^{(i)} \end{pmatrix}$, i = 1,2,...,k,

and let $s_i = |S_i| / |S_{11}^{(i)}| = |S_{22}^{(i)} - S_{21}^{(i)} (S_{11}^{(i)})^{-1} S_{12}^{(i)}|$. Then
to select a subset containing the population with the
smallest $\sigma_i$, Gupta and Panchapakesan [21] propose the procedure
R: Select $\pi_i$ iff

(31)  $s_i \le \frac{1}{D} \min_{1 \le j \le k} s_j$

where $0 < D = D(k, P^*, n, q_1, q_2) < 1$ is chosen so as to satisfy
(2).

The constant D is the 100(1-P*) percentage point of

(32)  $\zeta_{\min} = \min\left(\frac{\zeta_2}{\zeta_1}, \frac{\zeta_3}{\zeta_1}, \ldots, \frac{\zeta_k}{\zeta_1}\right)$

where $\zeta_i$, i = 1,2,...,k, are independent and each distributed
as a product of $q_2$ chi-square variables with d.f. $n-q_1-1, \ldots,$
$n-q_1-q_2$ respectively.


Distance Function, $\lambda_i = \mu_i' \Sigma_i^{-1} \mu_i$


We assume that $\mu_i$, i = 1,2,...,k are unknown. We are
interested in selecting a subset containing the population
associated with $\lambda_{[k]}$. Case (a). $\Sigma_i$, i = 1,2,...,k are known.

We take a sample of size n from each population. Let $x_{ij}$
denote the jth observation from the ith population, j = 1,2,
...,n; i = 1,2,...,k. Then we compute

(33)  $y_{ij} = x_{ij}' \Sigma_i^{-1} x_{ij}$, j = 1,2,...,n; i = 1,2,...,k.

Then the following procedure R is proposed by Gupta and Studden
[26]. R: Select $\pi_i$ iff

(34)  $\sum_{j=1}^{n} y_{ij} \ge c \max_{1 \le r \le k} \sum_{j=1}^{n} y_{rj}$

where 0 < c = c(k,np,P*) < 1 is determined to satisfy (2).
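As an illustration of the statistics in (33) and the rule (34), the following sketch computes the quadratic forms and applies the rule to a tiny data set. The function names, the data and the constant c = 0.6 are hypothetical choices made only for the example; the actual constant would come from the tables cited in the text.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]

def quad_form(x, sigma):
    """y = x' sigma^{-1} x, computed without forming the inverse."""
    z = solve(sigma, x)
    return sum(xi * zi for xi, zi in zip(x, z))

def select_large_distance(samples, sigmas, c):
    """Rule (34): keep population i iff sum_j y_ij >= c * max_r sum_j y_rj."""
    totals = [sum(quad_form(x, s) for x in xs)
              for xs, s in zip(samples, sigmas)]
    cutoff = c * max(totals)
    return [i for i, t in enumerate(totals) if t >= cutoff], totals

# Hypothetical example: k = 3 bivariate populations with identity
# covariance matrices and n = 2 observations each.
identity = [[1.0, 0.0], [0.0, 1.0]]
samples = [
    [[1.0, 0.0], [0.0, 1.0]],   # total 2
    [[3.0, 0.0], [0.0, 4.0]],   # total 25
    [[2.0, 2.0], [2.0, 2.0]],   # total 16
]
selected, totals = select_large_distance(samples, [identity] * 3, c=0.6)
```

With these numbers the cutoff is 0.6 × 25 = 15, so the second and third populations are retained.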




This procedure was considered first by Gupta [19], who 
proved the result concerning the infimum of the probability
of a correct selection when k=2. Later Gupta and Studden 
proved the result for any k > 2 by obtaining first a suffi- 
cient condition referred to earlier. The values of the con- 
stant c are the same as those for selecting from gamma popu- 
lations the one with the largest scale parameter which are 
given by Gupta [17] for selected values of np and P*. Table 
5 below gives the c values for a few values of P* and $\nu$ =
np. More extensive tables are available in Armitage and
Krishnaiah [2]. 


Table 5. Values of the constants c in the 
procedure R(34) for P* = .75 (top line), .90 
(middle line) and .95 (bottom line) [17] 





k 
2 3 4 5 10 
v 
- 610 - 486 434 404 woo 
8 - 586 oles . 286 . 268 228 
ol 242 220 - 206 - L7G 
645 526 475 -445 - 580 
10 -430 - 560 eo - 310 - 268 
0 e265 Zonk 247 male 
-671 SOD DON -478 -413 
12 -466 O10 - 564 ~345 ro Oalt 
O02 sOZ0 296 281 247 
a 50 637 SE 565 aOZ 
20 558 -492 - 460 -441 396 
471 -419 394 SOI 341 
5 WAS) sei oy on ILy - 696 -644 
50 694 640 614 oN, S00 
-625 - 580 a6 544 508 


Approximate values of c obtained by using Wilson-Hilferty 
cube root transformation are given by Gupta [19]. 

The rule for selecting the population with the smallest
$\lambda_i$ is defined by

R': Select $\pi_i$ iff

(35)  $\sum_{j=1}^{n} y_{ij} \le \frac{1}{d} \min_{1 \le r \le k} \sum_{j=1}^{n} y_{rj}$

where 0 < d = d(k,np,P*) < 1 is determined to satisfy (2).
The values of d are tabulated in Gupta and Sobel [25] for
selected values of np and P* and more extensively by
Krishnaiah and Armitage [32]. Case (b). $\Sigma_i$, i = 1,2,...,k, are
unknown.

The rules R and R' of case (a) are modified as follows.
Let $z_i = \bar{x}_i' S_i^{-1} \bar{x}_i$, where $\bar{x}_i$ is the sample mean vector of the
ith population and $S_i$ is the usual sample covariance matrix
with (n-1) as the divisor.

$R_1$: Select $\pi_i$ iff

(36)  $z_i \ge c_1 \max_{1 \le r \le k} z_r$

and

$R_1'$: Select $\pi_i$ iff

(37)  $z_i \le \frac{1}{d_1} \min_{1 \le r \le k} z_r$

where $0 < c_1 = c_1(k,p,n,P^*) < 1$ and $0 < d_1 = d_1(k,p,n,P^*) < 1$
are determined so that (2) is satisfied.


The values of the constants $c_1$ and $d_1$ are available in
Gupta and Panchapakesan [21] for selected values of k, P*,
q = p/2 and m = (n-p)/2. It can be verified that

(38)  $d_1(k,P^*,q,m) = c_1(k,P^*,m,q)$.


The constants $c_1$ and $d_1$ are the constants if we consider
similar procedures for selecting a subset of k non-central F
populations with same d.f. containing the population with the
largest or the smallest non-centrality parameter. They are
the constants necessary for procedures for p-variate normal
populations in terms of multiple correlation coefficients
using a transform of sample multiple correlation coefficient.

Table 6 gives the values of $c_1$ for selected values of q, m, k
and P* excerpted from Gupta and Panchapakesan [21] where the
values are given correct to five significant figures.


Table 6. Values of $c_1$ in procedure
$R_1$ (36) for P* = .75 (top), .90 (middle),
.95 (bottom) and k = 3 [21]





It should be pointed out that the procedures R and R' in
case (a) are not strictly analogous to those in case (b). If
we use procedures based on $\bar{x}_i' \Sigma_i^{-1} \bar{x}_i$ in case (a),
the corresponding constants c and d turn out to be independent
of n, the sample size, which is undesirable. Alam and Rizvi
[1] also consider the problem of selection in terms of
distance function and they show that $\sup_{\Omega} E(S|R) = k$.
2 


Multiple Correlation Coefficient 

Let $X_i' = (X_{i1}, X_{i2}, \ldots, X_{ip})$, i = 1,2,...,k be random vectors
with p-variate normal distributions with unknown mean vectors
$\mu_i$ and unknown positive definite covariance matrices $\Sigma_i$. The
multiple correlation coefficient between, say, $X_{i1}$ and
$(X_{i2}, \ldots, X_{ip})$, denoted by $\rho^{(i)}_{1.23\ldots p} = \rho_i$ ($\rho_i \ge 0$), is
defined by

(39)  $1 - \rho_i^2 = \frac{|\Sigma_i|}{\sigma_{11}^{(i)}\,|\Sigma_{(1)}^{(i)}|}$

where $\sigma_{11}^{(i)}$ is the leading element of $\Sigma_i$ and $\Sigma_{(1)}^{(i)}$ is the
matrix obtained from $\Sigma_i$ by deleting the first row and first
column. It is meaningful to rank the populations on the basis
of the values of $\rho_i$, i = 1,2,...,k. Let $R^{(i)}_{1.23\ldots p} = R_i$, the
sample multiple correlation coefficient between $X_{i1}$ and
$(X_{i2}, \ldots, X_{ip})$, be defined analogous to $\rho_i$ by replacing $\Sigma_i$ by
the sample covariance matrix $S_i$ in the definition of $\rho_i$. We
are interested in selecting a subset containing the population
with the largest (smallest) $\rho_i$, namely, $\rho_{[k]}$ ($\rho_{[1]}$). Two cases
arise: (1) the case when $X_{i2}, \ldots, X_{ip}$ are fixed, called the
conditional case, and (2) the case when $X_{i2}, \ldots, X_{ip}$ are random,
the unconditional case.

Based on $R_i^2$ obtained from a sample of size n > p from $\pi_i$,
the ith population (i = 1,2,...,k), Gupta and Panchapakesan
[21] propose the following procedures $R_1$ and $R_2$ for the case
of $\rho_{[k]}$ and $\rho_{[1]}$ respectively.

$R_1$: Select $\pi_i$ iff

(40)  $R_i^2 \ge c \max_{1 \le j \le k} R_j^2$

and

$R_2$: Select $\pi_i$ iff

(41)  $R_i^2 \le \frac{1}{d} \min_{1 \le j \le k} R_j^2$

where $0 < c = c(k,P^*,q,m) < 1$ and $0 < d = d(k,P^*,q,m) < 1$ are
chosen so that (2) is satisfied.
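A small sketch may help to fix the definitions. It computes the squared sample multiple correlation from a covariance matrix through the determinant ratio $1 - R^2 = |S| / (s_{11} |S_{(1)}|)$ and then applies a rule of the form (40). The function names, the input matrices and the constant c = 0.5 are hypothetical and serve only as an example.

```python
def det(m):
    """Determinant by cofactor expansion (adequate for small p)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j]
               * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))

def r_squared(s):
    """Squared multiple correlation of variable 1 on the others:
    1 - |S| / (s_11 * |S with first row and column deleted|)."""
    minor = [row[1:] for row in s[1:]]
    return 1.0 - det(s) / (s[0][0] * det(minor))

def select_large_rho(cov_matrices, c):
    """Keep population i iff R_i^2 >= c * max_j R_j^2, with 0 < c < 1."""
    r2 = [r_squared(s) for s in cov_matrices]
    cutoff = c * max(r2)
    return [i for i, v in enumerate(r2) if v >= cutoff], r2

# Hypothetical sample covariance matrices for k = 3 populations, p = 2.
# For p = 2 the multiple correlation is the ordinary correlation.
covs = [
    [[1.0, 0.5], [0.5, 1.0]],
    [[1.0, 0.9], [0.9, 1.0]],
    [[1.0, 0.1], [0.1, 1.0]],
]
selected, r2 = select_large_rho(covs, c=0.5)  # c is illustrative only
```

Here the three values of $R^2$ are 0.25, 0.81 and 0.01, so only the second population passes the cutoff 0.5 × 0.81.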

The distributions of $R_i^2$ in the unconditional and the
conditional cases are different when $\rho_i \ne 0$. The infimum of the
probability of a correct selection is shown to be attained
when $\rho_i = 0$, in which case the central distribution is the
same in both the cases. The result concerning the infimum is
proved by using the sufficient conditions referred to earlier.
The constants c as well as d are the same in both conditional
and unconditional cases. Table 7 below provides the values
of c for some selected values of q = (p-1)/2, m = (n-p)/2 and
P* when k = 3. More extensive tables are available in Gupta and
Panchapakesan [21] where the values are given to five
significant figures.


Table 7. Values of c in procedure
$R_1$ (40) for P* = .75 (top), .90 (middle),
.95 (bottom) and k = 3 [21]

[Table entries are not legible in the source scan.]





The constants c and d of $R_1$ and $R_2$ respectively are also the
constants for similar procedures to select from k non-central
beta populations with same parameters q and m, a subset
containing the one with the largest (smallest) non-centrality
parameter.

Now we define procedures $R_3$ and $R_4$ which are based on
$R_i^{*2} = R_i^2 / (1 - R_i^2)$, i = 1,2,...,k.

$R_3$: Select $\pi_i$ iff $R_i^{*2} \ge c_1 \max_{1 \le j \le k} R_j^{*2}$, and

$R_4$: Select $\pi_i$ iff $R_i^{*2} \le \frac{1}{d_1} \min_{1 \le j \le k} R_j^{*2}$,

where $0 < c_1 = c_1(k,P^*,q,m) < 1$ and $0 < d_1 = d_1(k,P^*,q,m) < 1$
are determined so that (2) is satisfied.

The constants $c_1$ and $d_1$ in this case are tabulated in Gupta
and Panchapakesan [21]. These are the same as the constants in
procedures $R_1$ and $R_1'$ (36) and (37) for selection in terms of
distance function, with q and m suitably defined.

For a few other selection procedures for multivariate
normal populations, reference could be made to Gnanadesikan [14].


A Sequential Procedure For Normal Means 


Let $\pi_1, \pi_2, \ldots, \pi_k$ be normal populations with means $\mu_i$, i =
1,2,...,k and common known variance $\sigma^2$. We assume that the
means are all known, but only the correct pairing of these
means with the populations is not known. In fact, for the
procedure we want to describe, it is enough if the differences
between successive ordered means are known. Barron [7] defines
a sequential procedure to select a subset containing the
population with $\mu_{[k]}$.

We first define a single stage procedure
R(n): Take a sample of size n from each population and select
$\pi_i$ iff $\bar{x}_i \ge \bar{x}_{\max} - d\sigma/\sqrt{n}$, where d > 0 is chosen such that (2) is
satisfied. $\bar{x}_i$, i = 1,2,...,k are the sample means and $\bar{x}_{\max}$ =
max($\bar{x}_1, \ldots, \bar{x}_k$).

At the first stage perform R(1) for each population.
Define $Y_{i1}$, i = 1,2,...,k as follows:

      $Y_{i1} = 0$ if $\pi_i$ is rejected by R(1),
      $Y_{i1} = 1$ if $\pi_i$ is selected by R(1).

Now draw an additional sample of size 1 from each population
and perform R(1) and define $Y_{i2}$. Continue this process. At
stage m, associated with $\pi_i$ are the random variables
$Y_{i1}, Y_{i2}, \ldots, Y_{im}$. Define $S_{im} = \sum_{j=1}^{m} Y_{ij}$. It should be noted
that $S_{im}$, i = 1,2,...,k are dependent binomial variables.
Two sequences $\{a_m\}$ and $\{b_m\}$ of real numbers monotonically
increasing to infinity are defined such that

(42)  $a_1 < 0 < b_1$; $a_j < b_j$, j = 2,3,...;
      $P\{a_m < S_{im} < b_m \text{ for all } m\} = 0$, i = 1,2,...,k.







The existence of such sequences is guaranteed by the law of 
the iterated logarithm. 

With this set-up, we define below the sequential procedure
$S_1$ proposed by Barron [7].

$S_1$: Tag population $\pi_i$, i = 1,2,...,k at the very first stage
m ≥ 1 when $S_{im} \notin (a_m, b_m)$ and mark it "rejected" if $S_{im} \le a_m$
and "accepted" if $S_{im} \ge b_m$. Continue sampling from all
k populations until each one has been tagged. Then accept
those marked "accepted" and reject those marked "rejected."

For the particular choice of $a_m = cm - d$ and $b_m = cm + d$,
where $c \in (0,1)$ and d > 0, $P\{CS|S_1\}$ is evaluated in Barron [7]
using a one-dimensional random walk on the integer space L =
{x | x = 0, ±1, ±2, ...}. Expressions for $E(m_i|S_1)$, the expected
number of stages until $\pi_i$ is tagged, are also derived. It can
be shown that for any $c \in (P_{k-1}, P_k)$, where $P_i$ is the
probability of selecting the population with mean $\mu_{[i]}$ by R(1),
there exists a d > 0 such that for any $\epsilon > 0$

(43)  $P\{CS|S_1(c)\} \ge 1 - \epsilon$ and
      $E\{S|S_1(c)\} \le 1 + (k-1)\epsilon$,

where S is the number of populations selected.
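The mechanics of the tagging rule with the linear boundaries $a_m = cm - d$ and $b_m = cm + d$ can be simulated directly. The sketch below is an illustration only: the function name, the parameter names (`d_rule` for the constant of R(1), `d_band` for the boundary constant) and the numerical values are hypothetical, and no attempt is made to choose them so that the probability requirement is met.

```python
import random

def barron_s1(means, sigma, d_rule, c, d_band, rng, max_stages=100000):
    """Simulate the tagging procedure with a_m = c*m - d_band and
    b_m = c*m + d_band. At each stage one observation is drawn from
    every population and R(1) is applied: population i is selected
    iff x_i >= max(x) - d_rule * sigma."""
    k = len(means)
    s = [0] * k            # S_im: number of stages at which i was selected
    tag = [None] * k       # "accepted", "rejected" or None (untagged)
    m = 0
    while any(t is None for t in tag) and m < max_stages:
        m += 1
        x = [rng.gauss(mu, sigma) for mu in means]
        cutoff = max(x) - d_rule * sigma
        for i in range(k):
            s[i] += 1 if x[i] >= cutoff else 0
            if tag[i] is None:          # tag only at the first exit
                if s[i] <= c * m - d_band:
                    tag[i] = "rejected"
                elif s[i] >= c * m + d_band:
                    tag[i] = "accepted"
    return tag, m

# Hypothetical configuration: one population far better than the rest.
rng = random.Random(0)
tags, stages = barron_s1(means=[0.0, 0.0, 10.0], sigma=1.0,
                         d_rule=1.0, c=0.5, d_band=5.0, rng=rng)
```

In this extreme configuration the best population is selected by R(1) at every stage and the others never are, so all three populations are tagged at stage m = 10, where $a_m = 0$ and $b_m = 10$.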

Further an approximate minimax rule for choosing a specific 
procedure that minimizes the maximum number of samples needed 
to make a decision on each population, among all procedures 
guaranteeing certain probability conditions, is discussed by 
Barron [7]. For the slippage configuration, namely, $\mu_{[1]}$
= ... = $\mu_{[k-1]}$ = $\mu$, $\mu_{[k]}$ = $\mu + \delta$, $\delta > 0$, let n be the
sample size for a fixed sample-size rule and let M be the
expected sample size for the sequential procedures. Table 8
below gives the ratio M/n for a few selected values of k and $\delta$
and for P* = .90.

Barron [7] also considers an eliminating type sequential
procedure where samples are not taken from a population once
it is tagged.


Table 8. Values of M/n under the slippage
configuration for P* = .90 [7]





Other Types of Formulations 


Studden [40] considers the problem of defining "optimal" 
subset selection rules for the case where we have k fixed 
density functions and only the correct pairing of the densities 
and populations is unknown. He also discusses a decision 
theoretic formulation of the selection problems and obtains a 
solution under the usual symmetry conditions. 

Krishnaiah and Rizvi [33] and Krishnaiah [31] have considered
procedures for selecting populations better than a standard
or control population. They have considered many criteria to
define a population better than the standard. Among the
criteria considered are the ones based on linear combinations of
the components of the mean vector and linear combinations of
the elements of the covariance matrix.


Bibliography 


1. Alam, K., and Rizvi, M.H. "Selection from Multivariate
Normal Populations," Annals of the Institute of Statistical
Mathematics (Tokyo), XVIII (1966), 307-318.


2. Armitage, J.V., and Krishnaiah, P.R. "Tables for the
Studentized Largest Chi-Square Distribution and Their
Applications," ARL 64-188. Aerospace Research Laboratories,
Wright-Patterson Air Force Base, Dayton, Ohio, (1964).


3. Bahadur, R.R. "On a Problem in the Theory of k Populations,"
Annals of Mathematical Statistics, XXI (1950), 362-375.


4. Barlow, R.E., and Gupta, S.S. "Selection Procedures for
Restricted Families of Probability Distributions,"
Annals of Mathematical Statistics, XL (1969), 905-917.


5. Barlow, R.E., Gupta, S.S., and Panchapakesan, S. "On the
Distribution of the Maximum and Minimum of Ratios of
Order Statistics," Annals of Mathematical Statistics,
XL (1969), 918-934.


6. Barlow, R.E., Marshall, A.W., and Proschan, F. "Properties
of Probability Distributions with Monotone Hazard
Rate," Annals of Mathematical Statistics, XXXIV (1963),
375-389.


7. Barron, A.M. "A Class of Sequential Multiple Decision
Procedures," Mimeo. Series No. 169, Department of Statistics,
Purdue University, (1968).


8. Bechhofer, R.E. "A Single-Sample Multiple Decision
Procedure for Ranking Means of Normal Populations with
Known Variances," Annals of Mathematical Statistics,
XXV (1954), 16-39.


9. Bechhofer, R.E., Elmaghraby, S., and Morse, N. "A Single-
Sample Multiple Decision Procedure for Selecting the
Multinomial Event Which has the Highest Probability,"
Annals of Mathematical Statistics, XXX (1959), 102-119.


10. Birnbaum, Z.W., Esary, J.D., and Marshall, A.W. "A
Stochastic Characterization of Wear-Out for Components and
Systems," Annals of Mathematical Statistics, XXXVII
(1966), 816-825.


11. Cacoullos, T., and Sobel, M. "An Inverse Sampling
Procedure for Selecting the Most Probable Event in a
Multinomial Distribution," Multivariate Analysis. Edited by
P.R. Krishnaiah. New York: Academic Press, Inc., 1966,
423-455.


12. Deely, J., and Gupta, S.S. "On the Properties of Subset
Selection Procedures," Sankhya, XXX (Series A) (1968),
37-50.


13. Dunnett, C.W. "A Multiple Comparison Procedure for
Comparing Several Treatments with a Control," Journal of
the American Statistical Association, L (1955), 1096-1121.


14. Gnanadesikan, M. "Some Selection and Ranking Procedures
for Multivariate Normal Populations," Unpublished Ph.D.
Thesis, Department of Statistics, Purdue University,
1966.


15. Gnanadesikan, M., and Gupta, S.S. "Selection Procedures
for Multivariate Normal Distributions in Terms of
Measures of Dispersion," (To appear in Technometrics).


16. Gupta, S.S. "On a Decision Rule for a Problem in Ranking
Means," Mimeo Series No. 150, Institute of Statistics,
University of North Carolina, Chapel Hill, N.C., (1956).


17. Gupta, S.S. "On a Selection and Ranking Procedure for
Gamma Populations," Annals of the Institute of Statistical
Mathematics, XIV (1963), 199-216.


18. Gupta, S.S. "On Some Multiple Decision (selection and
ranking) Rules," Technometrics, VII (1965), 225-245.


19. Gupta, S.S. "On Some Selection and Ranking Procedures
for Multivariate Normal Populations Using Distance
Functions," Multivariate Analysis. Edited by P.R.
Krishnaiah. New York: Academic Press, Inc., 1966, 457-475.


20. Gupta, S.S., and Nagel, K. "On Selection and Ranking
Procedures and Order Statistics from the Multinomial
Distribution," Sankhya, XXIX (Series B) (1967), 1-34.


21. Gupta, S.S., and Panchapakesan, S. "Some Selection and
Ranking Procedures for Multivariate Normal Populations,"
Multivariate Analysis II. Edited by P.R. Krishnaiah.
New York: Academic Press, Inc., 1969, 475-505.


22. Gupta, S.S., and Sobel, M. "On Selecting a Subset which
Contains All Populations Better than a Standard," Annals
of Mathematical Statistics, XXIX (1958), 235-244.


23. Gupta, S.S., and Sobel, M. "Selecting a Subset Containing
the Best of Several Binomial Populations," Contributions
to Probability and Statistics. Stanford: Stanford
University Press, 1960, 224-248.


24. Gupta, S.S., and Sobel, M. "On Selecting a Subset
Containing the Population with the Smallest Variance,"
Biometrika, XLIX (1962), 495-507.


25. Gupta, S.S., and Sobel, M. "On the Smallest of Several
Correlated F Statistics," Biometrika, XLIX (1962), 509-523.


26. Gupta, S.S., and Studden, W.J. "On Some Selection and
Ranking Procedures with Applications to Multivariate
Analysis," Mimeo. Series No. 58, Department of Statistics,
Purdue University, (1965). To appear in S.N. Roy
Memorial Volume, Statistical Publishing Society, Calcutta.


27. Gupta, S.S., and Studden, W.J. "Some Aspects of Selection
and Ranking Procedures with Applications," Mimeo. Series
No. 81, Department of Statistics, Purdue University,
(1966).


28. Hoel, P.G. "A Significance Test for Component Analysis,"
Annals of Mathematical Statistics, VIII (1937), 149-158.


29. Karlin, S. Total Positivity. Volume I. Stanford, Calif.:
Stanford University Press, 1968.


30. Kesten, H., and Morse, N. "A Property of the Multinomial
Distribution," Annals of Mathematical Statistics, XXX
(1959), 120-127.


31. Krishnaiah, P.R. "Selection Procedures Based on
Covariance Matrices of Multivariate Normal Populations,"
Blanch Anniversary Volume. Aerospace Research, United
States Air Force (1967), 147-160.


32. Krishnaiah, P.R., and Armitage, J.V. "Tables for the
Studentized Smallest Chi-Square, With Tables and
Applications," ARL 64-218. Aerospace Research Laboratories,
Wright-Patterson Air Force Base, Dayton, Ohio, (1964).


33. Krishnaiah, P.R., and Rizvi, M.H. "Some Procedures for
Selection of Multivariate Normal Populations Better than
a Control," Multivariate Analysis. Edited by P.R.
Krishnaiah. New York: Academic Press, Inc., 1966, 477-490.


34. Lehmann, E.L. Testing Statistical Hypotheses. New York:
John Wiley & Sons, 1959.


35. Lehmann, E.L. "Some Model I Problems of Selection,"
Annals of Mathematical Statistics, XXXII (1961), 990-1012.


36. Naylor, Thomas H., Wertz, Kenneth, and Wonnacott, Thomas
H. "Methods for Analyzing Data from Computer Simulation
Experiments," Communications of the ACM, X (1967), 703-710.


37. Rizvi, M.H., and Sobel, M. "Nonparametric Procedures for
Selecting a Subset Containing the Population With the
Largest α-Quantile," Annals of Mathematical Statistics,
XXXVIII (1967), 1788-1803.


38. Seal, K.C. "On a Class of Decision Procedures for Ranking
Means of Normal Populations," Annals of Mathematical
Statistics, XXVI (1955), 387-398.


39. Sobel, M. "Selecting a Subset Containing at Least One of
the t Best Populations," Multivariate Analysis II.
Edited by P.R. Krishnaiah. New York: Academic Press, Inc.,
1969.


40. Studden, W.J. "On Selecting a Subset of k Populations
Containing the Best," Annals of Mathematical Statistics,
XXXVIII (1967), 1072-1078.


Selection and Ranking 
Procedures: A Comment 


John S. Ramberg, University of Iowa


Since Professor Gupta's paper is expository in nature and 
covers recent developments of the "subset" approach to multi- 
ple-decision procedures, I will discuss an alternative approach 
to multiple-decision problems -- the "indifference zone" ap- 
proach and then will consider the relation of multiple-deci- 
sion procedures to some of the other papers presented at this 
symposium. 

The "indifference zone" approach differs from the "subset" 
approach in that the ultimate objective of the "indifference 
zone" approach is to provide a rational method for determining
the sample size when the goal is to select the t (often
t = 1) "best" populations. After determining the sample size 
and taking the observations on each of the populations the 
method of selecting the t "best" populations is usually obvi- 
ous. (E.g., for the problem of selecting the population with 
largest population mean, select the population which yields 
the largest sample mean.) 

For the "indifference zone" approach the practitioner must
specify a probability P* and an "indifference zone" constant
(say $\delta^*$). By consulting the appropriate table he can then
determine the sample size which will guarantee that the
probability of selecting the "best" set of t populations is greater
than or equal to P* whenever the t "best" populations are at
least $\delta^*$ better than the remaining populations.
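Where a table is not at hand, the requirement can be checked by simulation. The sketch below is a rough Monte Carlo illustration for the case t = 1, assuming normal populations with a common known standard deviation and the configuration in which the best mean exceeds all others by exactly $\delta^*$; the function names, the number of replications and the example values are hypothetical, and the tables referred to in the text remain the proper source for the sample size.

```python
import random

def prob_correct_selection(k, n, delta, sigma, reps, rng):
    """Monte Carlo estimate of the probability that the population with
    the largest mean also yields the largest sample mean, when its mean
    exceeds each of the other k - 1 means by delta."""
    se = sigma / n ** 0.5          # standard error of a sample mean
    hits = 0
    for _ in range(reps):
        best = rng.gauss(delta, se)
        others = max(rng.gauss(0.0, se) for _ in range(k - 1))
        hits += best > others
    return hits / reps

def smallest_n(k, delta, sigma, p_star, reps=20000, seed=0, n_max=1000):
    """Smallest n (by sequential search) whose estimated probability of
    a correct selection is at least p_star; None if n_max is exceeded."""
    rng = random.Random(seed)
    for n in range(1, n_max + 1):
        if prob_correct_selection(k, n, delta, sigma, reps, rng) >= p_star:
            return n
    return None

# Hypothetical example: k = 2 populations, delta* equal to sigma, P* = .90.
n_required = smallest_n(k=2, delta=1.0, sigma=1.0, p_star=0.90)
```

For k = 2 the exact probability of a correct selection is $\Phi(\delta^* \sqrt{n} / (\sigma \sqrt{2}))$, which gives a check on the simulation; because the estimate is noisy, values of n whose true probability lies very close to P* may be misclassified.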

It may also be of interest to note that the "subset"
approach and the "indifference zone" approach can be combined
into a two stage procedure. In the first stage the "subset"
approach is used to screen out or eliminate many of the "poor"
populations. In the second stage the "indifference zone"
approach is used to select one "best" population from the
remaining populations.

Since many of the papers given in previous sessions have 
discussed the use of response surface techniques it seems 
important to mention the relationship of multiple decision 
procedures to these techniques. Whereas response surface 
techniques are used when the variables or factors under con- 
sideration are quantitative, multiple decision procedures are 
used for qualitative variables or populations (which themselves 
may be multidimensional and involve continuous variates). The 
goals of multiple-decision procedures are similar to those of 
response surface techniques -- selection of the "best" popu- 
lation versus determination of the levels of the independent 
variables which optimize the response. Both quantitative and 
qualitative variables are often encountered in a simulation 
problem and little has been said concerning the solution of 
these problems. 

I will illustrate how some of the techniques discussed in 
previous sessions can be used in conjunction with multiple- 
decision procedures for designing computer simulation experi- 
ments by using a job-shop scheduling problem as an example. 
In this example the goal is to select a priority system for 
routing jobs through the shop. I will assume that the prac- 
titioner has already selected a performance measure upon 
which he will base his decision and has a set of k priority 
systems from which he desires to select the "best" one (e.g.,
first come first served, shortest operation rule, due date
rule). I'll also assume that questions concerning the run-in
time, length of an individual run, etc. have already been de- 
cided and the only question remaining concerns the number of 
runs (replications) to be made using each of the priority 
rules. Furthermore, I'll assume that the performance measure
is a location parameter, the distribution of the simulation
results is not too far from being normal, and the variance of
this distribution and the correlation between the results
generated by the original random variates and those generated by
the antithetic variates are the same, respectively, for each
priority rule, but are unknown.

For this problem the sample size n required is a function
of k, P*, $\delta^*$ and the variance of the process. The sample
size requirement can be reduced by using some of the variance 
reduction techniques discussed previously. In particular, 
one might use correlated sampling (or blocking) by choosing 
the same random number seed for each priority rule, thus sub- 
jecting each of the rules to the same event sequence. In ad- 
dition, one could use antithetic variables within each random 
number seed-priority rule combination to reduce the variabil- 
ity of each particular cell estimate. In Table 1 an example 
layout of a randomized block experimental design is given. 
The $X_{ijm}$ are the values of the performance measure for random
number seed i and priority rule j. The subscript m is either "o"
(original sequence of random numbers) or "a" (antithetic
sequence).

Since the variance is unknown one plausible way of proceeding
is to run the simulation for a "reasonable" number of
trials for each of the k priority rules. After computing
$(X_{ijo} + X_{ija})/2$ for each cell, the experimental design
can be analyzed in the usual manner using the "interaction"
sum of squares for estimating the residual variance. (Most
experimental design texts contain discussions of randomized
blocks.)
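The cell averaging and the randomized block analysis described above might be sketched as follows. The function name, the dictionary keys and the small data set are hypothetical; rows are taken to be random number seeds (blocks) and columns priority rules, and the residual variance is estimated from the interaction sum of squares with (b-1)(k-1) degrees of freedom.

```python
def randomized_block_anova(x_orig, x_anti):
    """Average the original and antithetic runs in each cell and
    decompose the cell means of a b-by-k randomized block layout."""
    b, k = len(x_orig), len(x_orig[0])
    y = [[(x_orig[i][j] + x_anti[i][j]) / 2 for j in range(k)]
         for i in range(b)]
    grand = sum(map(sum, y)) / (b * k)
    row = [sum(r) / k for r in y]                                # seed means
    col = [sum(y[i][j] for i in range(b)) / b for j in range(k)] # rule means
    ss_block = k * sum((r - grand) ** 2 for r in row)
    ss_rule = b * sum((c - grand) ** 2 for c in col)
    ss_resid = sum((y[i][j] - row[i] - col[j] + grand) ** 2
                   for i in range(b) for j in range(k))
    return {
        "cell_means": y,
        "rule_means": col,
        "ss_block": ss_block,
        "ss_rule": ss_rule,
        "ss_resid": ss_resid,
        "resid_var": ss_resid / ((b - 1) * (k - 1)),
    }

# Hypothetical results for b = 2 seeds and k = 2 priority rules.
x_orig = [[11.0, 13.0], [9.0, 15.0]]
x_anti = [[9.0, 11.0], [7.0, 13.0]]
result = randomized_block_anova(x_orig, x_anti)
```

With these numbers the cell means are 10, 12, 8 and 14, the rule means are 9 and 13, and the interaction sum of squares is 4 on one degree of freedom.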

The number of additional samples required can then be
computed using the results of Bechhofer, Dunnett, and Sobel [2].
(See also an example problem given by Naylor, Wertz and
Wonnacott (36); numbers in parentheses refer to references in
the Gupta paper.)

A measure of the efficiency of the antithetic variables
and blocking can be obtained by calculating the correlation
between the $X_{ijo}$ and $X_{ija}$ and computing the sum of squares due
to blocking, respectively.




The central point is that both formulations of multiple 
decision problems can be used within designed experiments, 
and when the goal is to select the "best" population, the 
"indifference zone" approach can be used to determine the 
sample size requirement. Some additional references for mul- 
tiple decision procedures which may be useful are [1], [3] 
and [4]. 


Table 1. A multiple-decision computer simulation
experimental design using antithetic variates and blocking
(correlated sampling).


Priority Rules 





Random Number Seeds 


Bibliography 


1. Bechhofer, R.E. "Single-Stage Procedures for Ranking 
Multiply-Classified Variances of Normal Populations," 
Technometrics, X (4) (1968), 693-714. 


2. Bechhofer, R.E., Dunnett, Charles, and Sobel, Milton. "A
Two-Sample Multiple Decision Procedure for Ranking Means
of Normal Populations with a Common Unknown Variance,"
Biometrika, XLI (1954), Parts 1 and 2, 170-176.


3. Bechhofer, R.E., Kiefer, J., and Sobel, Milton. Sequential
Identification and Ranking Procedures, Statistical
Research Monographs, III. Chicago: University of Chicago
Press, 1968.


4. Naylor, Thomas H., and Burdick, Donald S. "Design of
Computer Simulation Experiments for Industrial Systems,"
Communications of the ACM, IX (1966), 329-339.


Time Series Analysis 


Donald Watts, University of Wisconsin 


Introduction 


The analysis of time series is extremely important in com- 
puter simulation experiments with econometric models [5,6]. 
Particular applications include analysis of output results, 
comparison of simulator outputs with real-world data, and 
modeling of real-world data for simulator inputs. This paper 
describes the practice of methods of time series analysis with 
particular reference to the paper by Fishman and Kiviat [5]. 
The advantages and disadvantages of the methods are discussed 
with regard to their applicability to simulation. 


Methods of Time Series Analysis 


Most time series analysis falls into one of three
categories: correlation analysis, spectral analysis, or parametric
modeling. Each of these methods attempts to answer a different
question and hence may have advantages or disadvantages
depending on the situation. A disadvantage in a particular
situation should not preclude the use of a method, however,
for much is to be gained through complementary usage. Examples
of this complementary usage will be given using the time
series shown in Figures 1 and 2. Most of the mathematical detail
is given elsewhere [3,7].


[Figure 1. Normalized annual flow, Missouri River, 1897 to 1955.
Horizontal axis: t, Year.]

[Figure 2. Temperature of input to a chemical reactor.
Vertical axis: Temperature. Horizontal axis: Hours.]


Correlation Analysis

Correlation analysis attempts to answer the question "how
much does the value of a time series at time t depend on the
value at some previous time, say $t-\tau$?" Thus the correlation
function $\rho(\tau)$, which gives the correlation between
observations separated by $\tau$ units of time, provides a measure of the
"memory" of a stochastic process. Alternatively, the
correlation function provides a measure of the inertia or momentum of
the stochastic process.


Properties of correlation function estimates. Interpreta- 
tion of correlation function estimates requires considerable 
caution [1,2,7]. The main disadvantage with them is that, 
except for purely random or white noise, correlation estimators 
are highly autocorrelated. This can produce misleading cor- 
relation function estimates in the sense that they do not re- 
duce as rapidly as they should or they oscillate too strongly. 
On the other hand, the correlation function estimator is consistent,
so that given a very long record the estimator tends
to the true correlation function.

Another disadvantage of the correlation function is that it
is a descriptive or nonparametric statistic. The analysis still
requires subjective interpretation and description of the correlation
function. Nevertheless, in most instances the correlation
function is a useful one to compute, to inspect, to
interpret, and to reinterpret later in the analysis. It is
particularly useful as an intermediate step in spectral analysis
[7] for detecting trends and for specifying the details
of the spectral analysis.
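The correlation function estimate discussed above can be sketched in a few lines. This is only an illustration under stated assumptions: the divisor-N estimator is one common choice rather than one prescribed by the text, and the test series (a first-order autoregressive process with parameter 0.6) is hypothetical data, not one of the series analyzed here.

```python
import numpy as np

def sample_autocorrelation(z, max_lag):
    """Return r_0..r_max_lag using the divisor-N estimator (an assumed convention)."""
    z = np.asarray(z, dtype=float)
    n = z.size
    d = z - z.mean()
    c0 = np.dot(d, d) / n
    return np.array([np.dot(d[: n - k], d[k:]) / n / c0 for k in range(max_lag + 1)])

# Hypothetical data: first-order autoregressive series with parameter 0.6.
rng = np.random.default_rng(0)
a = rng.normal(size=2000)
z = np.zeros_like(a)
for t in range(1, a.size):
    z[t] = 0.6 * z[t - 1] + a[t]

r = sample_autocorrelation(z, 5)
print(np.round(r, 2))   # the estimates should decay roughly like 0.6**k
```

With a long record the estimates settle near the theoretical values, which is the consistency property noted above; with short records neighboring estimates move together, which is the source of the caution.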


Interpretation of correlation functions. Inspection of a
correlation function should assist in describing a process.
Thus, for the correlation function estimate shown in Figure 3,
it is noted that the function drops very smoothly towards the
time axis, which suggests that the series has much inertia.
This inertia may be explained by the storage capacity of the
Missouri River basin [4].

Consider, now, the correlation function shown in Figure 4.
This function drops rapidly for the first two lags and then
exhibits a smooth, lightly damped oscillatory motion. This
indicates that the process is composed of a strong oscillatory
or sinusoidal component with period approximately 24 minutes,
plus an almost purely random component of approximately equal
power.

[Figure 3. Autocorrelation function of Missouri River data, N = 58. Horizontal axis: lag k (years).]

[Figure 4. Autocorrelation function of temperature data, N = 300. Horizontal axis: lag (minutes).]


Spectral Analysis 


Spectral analysis attempts to answer the question "how is
the average power or variance of a time series distributed with
frequency?" The spectrum is a very useful descriptor of time
series, particularly for physical scientists or engineers, since
they naturally describe the behavior of a trace or record in
terms of high or low frequency components or oscillations. The
temperature trace (Figure 2), for example, has a strong low-frequency
oscillation. The Missouri River data exhibit mostly
low-frequency behavior.


Properties of spectral estimators. Spectral estimators enjoy
one major advantage over correlation function estimators, namely
that adjacent spectral estimators are essentially uncorrelated.
Hence their distribution and estimation properties may be well
defined. In fact, to a very good approximation, no matter what
the probability distribution of the original process, the spectral
estimator at a frequency f equal to a multiple of 1/T,
where T is the record length, is distributed proportionally to
an independent chi-square random variable with two degrees of
freedom [1,2,7]. As a consequence, spectral estimators are not
consistent, since their variance does not decrease with increasing
record length. Hence, to obtain a useful spectrum estimate
one must resort to taking a weighted average of adjacent
independent spectral estimates. That is, one must resort to
smoothing [1,2,7].
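The lack of consistency described above can be illustrated numerically. The following is a minimal sketch, assuming unit-variance white noise as hypothetical input and a simple average of non-overlapping groups of ordinates as the smoother (the text does not prescribe a particular weighting). A chi-square variable with two degrees of freedom has a coefficient of variation of one, so the ratio of standard deviation to mean of the raw ordinates should stay near one whatever the record length.

```python
import numpy as np

def periodogram(z):
    """Raw periodogram at the harmonics k/N, k = 1..N/2 - 1 (unit sampling interval)."""
    z = np.asarray(z, dtype=float)
    n = z.size
    dft = np.fft.rfft(z - z.mean())
    return (np.abs(dft) ** 2 / n)[1 : n // 2]

def block_average(p, m):
    """Average non-overlapping groups of m adjacent periodogram ordinates."""
    usable = (p.size // m) * m
    return p[:usable].reshape(-1, m).mean(axis=1)

rng = np.random.default_rng(1)
cv_raw = {}
for n in (256, 4096):
    p = periodogram(rng.normal(size=n))      # hypothetical white noise, unit variance
    cv_raw[n] = p.std() / p.mean()           # coefficient of variation of the ordinates

smoothed = block_average(p, 16)              # uses the N = 4096 periodogram
cv_smoothed = smoothed.std() / smoothed.mean()
print(cv_raw, cv_smoothed)
```

Lengthening the record adds more ordinates but does not make any one of them less variable; only the averaging step reduces the variability, at the cost of a wider effective bandwidth.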

Unfortunately, while smoothing reduces the variance of the
spectral estimators, it also introduces bias. Thus a compromise
must be effected between achieving small variability,
termed high stability, and small bias, termed high fidelity
[7]. It is felt that the term resolution, which has been used
in spectral concepts [2], is not appropriate since this optical
analogy presupposes that one is trying to resolve lines or delta
functions in the spectrum. Real spectra can never be described
in this way.

A practical procedure for estimating and smoothing spectra 
has been fully described in [7]. This procedure uses a window 
closing procedure in which the bandwidth of the spectral window 
is progressively made smaller. Usually three spectral estimates 
are calculated and the changes in them with varying bandwidth (BW) 
are noted. In an ideal situation, the spectrum will change 
markedly as the bandwidth is first narrowed and then will settle 
down indicating that the true detail in the spectrum has been 
revealed. As the bandwidth is further narrowed, sampling variability
will introduce spurious detail. The trick is in knowing
what is real power and what is sampling variability.
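The window closing idea can be sketched as follows. This is an illustration only: the Tukey lag window, the one-sided scaling, the function names, and the test series (a sinusoid of period 25 samples plus white noise of equal power) are all assumptions made for the example, not the procedure of [7] reproduced in detail. Increasing the truncation point narrows the bandwidth of the spectral window.

```python
import numpy as np

def autocovariance(z, max_lag):
    """Sample autocovariances c_0..c_max_lag with divisor N."""
    z = np.asarray(z, dtype=float)
    n = z.size
    d = z - z.mean()
    return np.array([np.dot(d[: n - k], d[k:]) / n for k in range(max_lag + 1)])

def smoothed_spectrum(z, trunc, freqs, dt=1.0):
    """One-sided lag-window estimate using a Tukey window truncated at lag `trunc`."""
    c = autocovariance(z, trunc)
    k = np.arange(1, trunc)                       # the window weight vanishes at k = trunc
    w = 0.5 * (1.0 + np.cos(np.pi * k / trunc))   # Tukey (Hanning) lag window
    cosines = np.cos(2.0 * np.pi * np.outer(freqs, k) * dt)
    return 2.0 * dt * (c[0] + 2.0 * cosines @ (w * c[1:trunc]))

# Hypothetical series: sinusoid of period 25 samples plus white noise of equal power.
rng = np.random.default_rng(2)
t = np.arange(300)
z = np.sin(2.0 * np.pi * t / 25.0) + rng.normal(scale=np.sqrt(0.5), size=t.size)

freqs = np.linspace(0.0, 0.5, 201)
estimates = {L: smoothed_spectrum(z, L, freqs) for L in (30, 45, 60)}
peaks = {L: freqs[np.argmax(s)] for L, s in estimates.items()}
for L in (30, 45, 60):
    print(L, round(float(peaks[L]), 4), round(float(estimates[L].max()), 2))
```

Overlaying the three estimates on one graph shows whether the peak has stopped changing shape as the window closes, which is the judgment the procedure asks the analyst to make.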

Finally, spectra, like correlation functions, are descriptive
and nonparametric, which is disadvantageous in some situations.


Interpretation of spectra. The spectral plot for the Missouri
River data is shown in Figure 5. The spectra were computed using
lags of 6, 12 and 20, for which the confidence intervals
and bandwidths are shown. The spectrum reveals little more than
could be inferred from the correlation function. The spectrum
for 20 lags shows instability due to the small number of degrees
of freedom.

Figure 6 shows spectra for the temperature data of Figure 2,
based on 300 observations and lags of 30, 45 and 60. The most
important features of this plot are the strong peak at about
.013 cycles per minute (cpm) and the flat spectrum above .02
cpm. The window closing procedure shows considerable change
when the number of lags is increased from 30 to 45, and very little
change from 45 to 60. Hence it may be concluded that the peak
at .013 cpm is adequately defined. Note that for 45 and 60
lags the bandwidth of the spectral window is narrower than the
peak and hence little is to be gained by using larger lags.

The number of lags to be used was decided on the basis of the
correlation plot. The oscillations are quite diminished by 45
lags, and so lags of 30 and 60, giving a ratio of bandwidths
of 2:1, were chosen.

[Figure 5. Spectrum estimates for Missouri River data, N = 58. Horizontal axis: frequency in cycles per year.]

[Figure 6. Spectrum estimates for temperature data, N = 300. Horizontal axis: frequency in cycles per minute; vertical axis: log10 of spectrum.]

A more significant feature of this spectrum is the relatively 
large power level from which the peak emerges. This suggests 
that there are two mechanisms working additively, as this spec- 
trum can be produced by adding approximately white noise to a 
second-order autoregressive process. Such a spectrum could be 
produced when the measured output of an oscillatory system was 
subject to large random disturbances or measurement error. 

The spectrum in this case can be a very valuable aid in describ- 
ing the physical system and in deriving a parametric model. 

Note that this information is also available from the correlation
plot as evidenced by the initial behavior of the correlation
function estimate.
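The additive mechanism suggested above can be written down explicitly. The following is a sketch under assumed notation that does not appear in the text: $X_t$ is the oscillatory second-order autoregressive component, $N_t$ is white noise with variance $\sigma_N^2$ independent of $A_t$, $\Delta$ is the sampling interval, and the spectrum is taken one-sided.

```latex
% Assumed setup (not from the text): X_t is second-order autoregressive,
% N_t is white measurement noise independent of A_t, and Z_t = X_t + N_t.
\begin{align*}
X_t &= \phi_1 X_{t-1} + \phi_2 X_{t-2} + A_t, \qquad Z_t = X_t + N_t, \\
\Gamma_Z(f) &= \frac{2\Delta\,\sigma_A^2}
  {1 + \phi_1^2 + \phi_2^2 - 2\phi_1(1-\phi_2)\cos 2\pi f\Delta - 2\phi_2\cos 4\pi f\Delta}
  + 2\Delta\,\sigma_N^2, \qquad 0 \le f \le \frac{1}{2\Delta}.
\end{align*}
```

The first term supplies the peak and the second a constant floor from which the peak emerges. Because $Z_t - \phi_1 Z_{t-1} - \phi_2 Z_{t-2} = A_t + N_t - \phi_1 N_{t-1} - \phi_2 N_{t-2}$ has no autocovariance beyond lag 2, the sum can also be expressed as a mixed model with two autoregressive and two moving average parameters.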


Parametric Modeling 


Parametric modeling [3,7] attempts to discover the structure
of a time series so that the residuals, after fitting the model,
are purely random or white noise. All the information conveyed
by the time series has thus been extracted so that a future observation
is composed of a forecast value plus an unpredictable
white noise term. Linear models for stationary and nonstationary
time series have had extensive development because of their
physical plausibility and mathematical tractability. The parameters
in the model often have good physical interpretations
which add to their usefulness. The main advantage of parametric
modeling over correlation and spectral analysis is that the
model may be used for forecasting. In addition, modeling is a
parametric approach which generally requires few parameters with
well-defined estimation properties. Finally, development of a
model is a more intimate and demanding analysis than description
of functions so that parametric modeling generally yields a
greater understanding. In some situations, such as analyzing or
comparing simulation outputs, the more descriptive approaches
of correlation or spectral analysis are more appropriate. Providing
simulation inputs, however, is best done using a parametric
approach.

It is worthwhile mentioning again, though, that the three
methods all provide complementary and corroborative information,
as will be seen presently.


Fitting of models. Fitting of linear models to time series 
involves a three-stage iterative procedure consisting of iden- 
tification, estimation, and diagnostic checking [3]. 

Identification consists of using the data and any additional 
knowledge to suggest whether the series can be described as 
stationary or nonstationary, and as moving average (ma), auto- 
regressive (ar), or mixed ma-ar. 

Estimation consists of using the data to estimate, and make 
inferences about, values of the parameters conditional on the 
tentatively identified model. 

Diagnostic checking involves examination of the residuals 
from fitted models, which can result in either 

(a) no indication of model inadequacy, or 


(b) model inadequacy, together with information on how to 
better describe the series. 


Thus the residuals would be examined for any lack of randomness 
using correlation and/or spectral analysis. If the residuals 
exhibit nonrandomness, this information would be used to modify 
the model. The modified model would then be fitted and sub- 
jected to diagnostic checking. 

To illustrate the model fitting procedure, consider the Missouri
River data. As mentioned previously, the autocorrelation
function shown in Figure 3 damps out very smoothly so that a
simple first-order autoregressive model of the form

(1)    $Z_t - \mu = \phi\,(Z_{t-1} - \mu) + A_t$

is suggested. This model is denoted as a (1,0) model. In
equation (1), $Z_t$ is a random variable describing annual flow
at time t, $\mu$ is the average annual flow, and $A_t$ is assumed to be a
purely random Normal white noise process with zero mean. That
is, $E[A_t A_{t+k}] = 0$ for all $k \neq 0$. Under these assumptions the
maximum likelihood or least squares estimate for the parameter
$\phi$, given observations $z_t$, $t = 1, \ldots, N$, is [3]

$$\hat{\phi} = \frac{\sum_t (z_t - \bar{z})(z_{t+1} - \bar{z})}{\sum_t (z_t - \bar{z})^2},$$

where $\bar{z} = \sum_t z_t / N$. In this example, the final model is

(2)    $Z_t - \bar{z} = .59\,(Z_{t-1} - \bar{z}) + A_t$.


Note that in this case the inertia of the series is described
by the large positive value of the parameter. The autocorrelation
function of the residuals for this model is shown in
Figure 7. It gives no indication that the model is inappropriate.

The (1,0) model, equation (2), may be used for forecasting
future values. Thus, taking conditional expectations at time
t,

$$E[Z_{t+1} \mid t] = \bar{z} + .59\,(z_t - \bar{z}),$$

which is the best one-step-ahead forecast at time t of the
flow at time t+1. This model has been used to forecast the
flow for the Missouri River, and the one-step-ahead forecasts
are shown in Figure 8 superimposed on the actual flows. There
is good agreement, even during the drought years.
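The estimation, diagnostic checking, and forecasting steps for a (1,0) model can be sketched as follows. The Missouri River record is not reproduced here, so the example runs on hypothetical data (mean 1.0, parameter 0.6, 500 points, all chosen for illustration), and the limits of 1.96 divided by the square root of the number of residuals are the usual large-sample approximation for white noise rather than a value taken from the text.

```python
import numpy as np

def fit_ar1(z):
    """Estimate phi from deviations about the sample mean (lag-one products over sum of squares)."""
    z = np.asarray(z, dtype=float)
    zbar = z.mean()
    d = z - zbar
    phi = np.dot(d[:-1], d[1:]) / np.dot(d, d)
    return zbar, phi

def autocorrelation(x, max_lag):
    """Sample autocorrelations r_1..r_max_lag."""
    x = np.asarray(x, dtype=float)
    d = x - x.mean()
    c0 = np.dot(d, d)
    return np.array([np.dot(d[: d.size - k], d[k:]) / c0 for k in range(1, max_lag + 1)])

# Hypothetical record: mean 1.0, phi = 0.6, white noise standard deviation 0.2.
rng = np.random.default_rng(4)
n = 500
z = np.empty(n)
z[0] = 1.0
for t in range(1, n):
    z[t] = 1.0 + 0.6 * (z[t - 1] - 1.0) + rng.normal(scale=0.2)

zbar, phi = fit_ar1(z)                         # estimation
forecasts = zbar + phi * (z[:-1] - zbar)       # one-step-ahead forecast of z[t+1] made at time t
residuals = z[1:] - forecasts                  # diagnostic checking works on these
r = autocorrelation(residuals, 10)
limit = 1.96 / np.sqrt(residuals.size)         # approximate 95% limits for white noise
outside = int(np.sum(np.abs(r) > limit))
print(round(float(phi), 2), outside, "of", r.size, "residual correlations outside the limits")
```

If many residual correlations fell outside the limits, the pattern of the offending lags would suggest how to extend the model before refitting.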

[Figure 7. Autocorrelation function of residuals after fitting (1,0) model to Missouri River data, N = 58, with 95% confidence limits.]

[Figure 8. Normalized annual flow and one-step-ahead forecasts for Missouri River data, N = 58. Horizontal axis: t, year.]

As a second example, consider the temperature data of Figure 2
and its autocorrelation function, Figure 4. The behavior
of the autocorrelation function suggests a (2,2) model of
the form

(3)    $Z_t = \phi_1 Z_{t-1} + \phi_2 Z_{t-2} + A_t - \theta_1 A_{t-1} - \theta_2 A_{t-2}$.

The random processes $Z_t$ and $A_t$ are assumed to have the same
properties as before; the constants $\phi_1$ and $\phi_2$ are autoregressive
parameters and $\theta_1$ and $\theta_2$ are moving average parameters.
Estimation of parameters in this model can be done using iterative
or nonlinear estimation [3,7]. This in turn requires specification
of initial estimates for $\theta_1$ and $\theta_2$, or plots of sums of
squares $S(\theta_1, \theta_2)$ as functions of values of $\theta_1$ and $\theta_2$.
From plots of sums of squares contours and inspection of
correlations of residuals, it is possible to modify or extend
the model so that the residuals are uncorrelated and have a
small sum of squares. For this time series, it was necessary
to extend the (2,2) model to a (2,3) model. The final model
was


Z, = 1,90 2, + 97 25 + AL 1.7 Ae 


pe, A, 3 
Fitting of this model was considerably aided by spectral
analysis. The sum of squares values for the (2,2) model are
shown below for the initial search in the $(\theta_1, \theta_2)$ parameter
space.

SUA WSR BYR 2Aoil'S ta 

SG ZS Aneel OSS) e3 

3073 2860 2711 2632 * 

S02 See ZCS 02.09 Sine OF OREO a 

2984 2789 2673 2621 2624 2662 

2906 2734 2648 2626 2661 2729 

ZTOAEZ60)5) Z05)2 eZ 6)S4 ay OZ Core 
The minimum sum of squares occurred at a =) = 60), s = .30 with 
$\phi_1 = -.70$, $\phi_2 = .65$. However, many of the correlations of the
residuals fell outside the approximate 95% confidence limits.
The spectrum for this process is not similar to the spectrum 
shown in Figure 6 and so another search was made in the (81,95) 
plane, this time around the values 85 = Si, 6, = -.9. A 
second order autoregressive process with these values would 
yield a peak in the spectrum near .012 cpm. So would an auto- 
regressive model plus white noise which, when factored would 
yield a mixed model with similar moving-average parameters. 

The sum of squares values for the new parameter values are 


shown below. 




2764 

2682 

2990 2747 2660 2620 2602 

2909 D592 ZS 7. ZO, 2520 

2577 2399 2370 2387 2421 

3844 3330 2946 2129 DSi 

Lag 1 2 3 4 5 

Autocorrelations of residuals BA Gis abode: 


The minimum here is about 8% lower and high-lag correlations 
are all very small, but the first two correlations are quite 
large, suggesting a serious inadequacy in the model. An addi- 
tional moving average term was tried. Increasing the model to 
(2,3) produced an additional reduction of 3% in the minimum 


sum of squares with very small correlations of the residuals. 
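A search of the kind described above, over a grid of moving average parameters, can be sketched as follows. Everything specific here is an assumption made for illustration: the sign convention for the moving average terms (subtracted, as is usual for mixed models), the conditional treatment of starting values (zeros), the least squares step for the autoregressive parameters at each grid point, and the hypothetical test series with parameters (1.5, -0.8) and (0.5, -0.3). It is not the computation actually used for the temperature data.

```python
import numpy as np

def conditional_sum_of_squares(z, theta1, theta2):
    """For fixed (theta1, theta2) return (S, phi1, phi2), with phi chosen by least squares."""
    d = np.asarray(z, dtype=float) - np.mean(z)
    y = np.zeros_like(d)                       # y = d filtered by the inverse MA operator, from zeros
    for t in range(d.size):
        y[t] = d[t]
        if t >= 1:
            y[t] += theta1 * y[t - 1]
        if t >= 2:
            y[t] += theta2 * y[t - 2]
    X = np.column_stack([y[1:-1], y[:-2]])     # regress y_t on y_{t-1} and y_{t-2}
    phi, *_ = np.linalg.lstsq(X, y[2:], rcond=None)
    resid = y[2:] - X @ phi
    return float(resid @ resid), float(phi[0]), float(phi[1])

# Hypothetical (2,2) series; a burn-in is discarded to remove start-up effects.
rng = np.random.default_rng(3)
n, burn = 2000, 200
a = rng.normal(size=n + burn)
z = np.zeros(n + burn)
for t in range(2, n + burn):
    z[t] = 1.5 * z[t - 1] - 0.8 * z[t - 2] + a[t] - 0.5 * a[t - 1] + 0.3 * a[t - 2]
z = z[burn:]

theta1_grid = [0.1, 0.3, 0.5, 0.7]             # grid kept inside the invertibility region
theta2_grid = [-0.5, -0.3, -0.1, 0.1]
table = {(t1, t2): conditional_sum_of_squares(z, t1, t2)
         for t1 in theta1_grid for t2 in theta2_grid}
best = min(table, key=lambda key: table[key][0])
print("minimum at", best, "with S, phi1, phi2 =", table[best])
```

Printing the full table in the layout of the grid gives the kind of display shown above, from which contours of the sum of squares can be sketched and the search region moved.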


Some Comments on "The Analysis of Simulation-
Generated Time Series" by George S. Fishman
and Philip J. Kiviat [5]
In this paper, correlation and spectral analyses were ap- 
plied to data generated from a single-channel queuing system 
simulation. The simulator results departed considerably from 
the theoretical results, as noted by Fishman and Kiviat. The 
theoretical correlation function is smooth and monotonic, but 
the estimated correlation function appears to oscillate with a 
period of around 200-250 hours. This oscillation appears to be 
real and could explain the lack of stabilization of the spectral 
estimates at .0033 and .0067 cph mentioned. This example il- 
lustrates the usefulness of window-closing and overlaying sev- 
eral spectral estimates on one graph, and also the necessity 
of using an adequate number of frequency points. It has been 
suggested [2] that spectral estimates be plotted only at a 
spacing equal to the reciprocal of the number of lags used, but 
this is often not fine enough [7]. 


The correlation function [5, Figure 5] and the spectral
estimate [5, Figure 6] in Fishman and Kiviat's article both
indicate that the one-hour sampling interval is too short.
The correlation functions shown for the FCFS system would be
well enough defined by having a sampling interval of about 20
hours. This is confirmed by the spectral estimates, which show
the power above .025 cph at about .01 of the maximum. The correlation
function [5, Figure 8] and spectrum [5, Figure 9] for
the SHOPN mode require a sampling interval of about 5 hours as
suggested. Having discovered this, it is then appropriate to
do the spectrum analysis showing the fine detail of the important
part of the spectrum.

One of the most significant aspects of the analysis, which 
was not mentioned in the paper, is the fact that the correla- 
tion functions for the FCFS and SHOPN modes both exhibit a 
tendency to oscillate. This should alert the investigator to 
look very closely for peaks in the spectrum and to question 
why the oscillations appear. Perhaps they are caused by the 
simulation input from the random number generator. 

The correlogram [5, Figure 11] for the constant service time 
mode illustrates a nonstationary phenomenon, as mentioned by 
Fishman and Kiviat. Filtering the data by differencing would 
reveal the periodicity in the system much more clearly, and 
would enable the spectrum analysis to be done using many fewer 
lags. 

It is doubtful that parametric models would be useful in
analyzing or comparing simulator outputs such as those discussed
above. However, the techniques and concepts used in
parametric modeling do enhance the spectral analyst's abilities.


Bibliography 


1. Bartlett, M.S. An Introduction to Stochastic Processes.
New York: Cambridge University Press, 1955.

2. Blackman, R.B., and Tukey, J.W. The Measurement of Power
Spectra. New York: Dover Publications, Inc., 1959.

3. Box, G.E.P., and Jenkins, Gwilym M. Time Series, Forecasting,
and Control. San Francisco: Holden-Day, Inc. (to
be published).

4. Carlson, R.F., MacCormick, A.J., and Watts, Donald G.
"Linear Random Models of Annual Streamflow Series," (submitted
to Water Resources Research).

5. Fishman, George S., and Kiviat, Philip J. "The Analysis
of Simulation-Generated Time Series," Management Science,
XIII (March 1967), 525-557.

6. Granger, C.W.J., and Hatanaka, M. Spectral Analysis of
Economic Time Series. Princeton, N.J.: Princeton University
Press, 1964.

7. Jenkins, Gwilym M., and Watts, Donald G. Spectral Analysis
and Its Applications. San Francisco: Holden-Day, Inc.,
1968.


Statics and Dynamics 
of Simulation 


Timothy Y. Ling, IBM 


Introduction 


In simulation experiments there are two ways a systems analyst
can look at the measures of system performance [38]:


1. the measures of average performance (static comparisons 
of means, variances, and histograms), and 


2. the measures of time-dependent behavior patterns (a dy- 
namic portrayal of a correlogram, spectrum, and/or auto- 
regressive scheme). 


A sophisticated analyst should look at both (to understand not
only what level a certain average measure achieves but also how
it is achieved through simulated time). In other words, a simulation
analyst is concerned with techniques that move a simulation
model through time from state to state and with techniques
that draw inference from these movements (i.e., to derive
a general solution from specific numerical solutions) [47].

A simulation model is essentially a system state generator 
[38, p. 18]. The description of a system may be viewed as con- 
sisting of two prime components [39, p. 1]: 


1. a static representation which concerns existence (the 
temporary and permanent entities or the exogenous and 
endogenous variables), and 


2. a dynamic representation which concerns changes (the pre- 
vious, current, and future states or the lagged, current, 
and leading variables). 


Models that concentrate on the static components of a system
tend to be more structural, whereas models that concentrate on
the dynamic components tend to be more mechanistic. The degree
to which a model is able to incorporate both features efficiently
is a function of the assumptions made about its internal operations
(static and dynamic) and the technique used in its
implementation [38, pp. 17-18].

From the above discussion it is evident that simulation 
model design and implementation have both static and dynamic 
features. A statistical rationale for design and analysis of 
simulation experiments should also include static and dynamic 
elements. It is the merit of digital simulation experiments
that independent (static) replications of nested interdependent
(dynamic) behavior patterns can be observed under controlled
conditions. In order to adapt statistical methods (e.g., a
static experimental design and dynamic time series analysis)
to the demands of a computer simulation environment, a synthesis
of the static and dynamic parts [4, p. 1] of a statistical
theory is necessary [22, p. 6].

Simulation is most frequently employed to study the perfor- 
mance of a system configuration or an operating discipline under 
idealized steady-state conditions. The experimenter seeks in- 
formation about the limiting distribution of the state of the 
simulated system [11, pp. 48-49]. Based on the stationary en- 
vironment of simulation, a synthesis of statistical concepts 
of statics and dynamics is theoretically feasible. This paper 
will take a static-dynamic as well as dynamic-static view of 
the underlying total concept. An extension of the total con- 
cept to some extreme cases will present no difficulty: e.g., 
pure statics for experimentation with a static model with a 
stationary environment, pure dynamics for experimentation with 
a dynamic model with a nonstationary environment, and a degen- 
erate case for deterministic experimentations. 

To facilitate a synthesis of statistical concepts (see
Figure 1), the static and dynamic aspects of a theory of simulation
statistics may be outlined as follows [41]:

Statics:     A statistics of random events, the classical
             theory of sampling and expectation related to
             a theory of the distribution of external and
             internal variances [13, pp. 572-590].

Dynamics:    A statistics of changes over time, particularly
             the correlation theory of random functions
             related to the Ergodic Theorem [56, p. 17].

Statics and
Dynamics:    A statistics of replicated dynamic experiments,
             particularly a simulation cluster sampling
             theory related to an asymptotic distribution
             theory of autoregressive schemes [50, p. 216].

[Figure 1. Exposition of a statistical concept of statics and dynamics. Static: independent replications (random times). Dynamic: successive repetitions (a correlation with respect to time lags).]


A Statistical, Static-Dynamic Concept 
of Simulation Experiments 


The Technique of Simulation from the Standpoint of Experi- 
mental Sampling 


Simulation modeling is essentially a technique for building
theories or hypotheses that are characterized by the production of
part or all of the essential (signal) output of a behavior system
[9, p. 920]. For instance, the principal purpose of sensitivity
experiments on an economic simulation model is to test concepts
about the total behavior of an economic system. An applied
model of this class can yield powerful analytical and
policy insights [18, p. 1]. Since stochastic variation is a
characteristic of the real world, the model builder as well
as the experimenter may introduce some stochastic elements, in
addition to a deterministic simulation, so as to more faithfully
reproduce characteristics of the real world [12, p. 29].




To validate statistically a hypothesized simulation model, 
the technique of sensitivity experiments is essentially an ex- 
tension of the technique known as experimental sampling or dis- 
tribution sampling, which has been used in the field of stat- 
Ste Su One Manyayecansn lola pme cl in hel experiumenter meeds 
theory to give structure and purpose to his experiments, and 
the theoretician needs experiments to assess the implications 
and the value of his theory [30, p. 6]. The experimental task 
has not come to replace the theoretical one. On the contrary, 
a scientific experimental investigation should be based on a 
theory or formal hypothesis [12, p. 26]. A good example of 
validation of a hypothesized model is the use of spectral analysis
of both the real and simulated time series [45, p. 1338].

In some respects a simulation experiment may be viewed as 
the generation of stochastic processes by the Monte Carlo 
methods [22, p. 2]. The technique of generating random num- 
bers (i.e., the method of Monte Carlo computations) developed 
for empirical (experimental) sampling can be applied directly 
to simulation experiments. However, in other respects,
simulation is more difficult than empirical sampling, and here 
the classical theory of distribution sampling does not have 
much to offer. The difficulties are due to a lack of inde- 
pendence among time series, nonstationarity of the time series, 
and the large number of parameters involved [51, p. 27]. 

Much of the experimental work (in regard to policy applica- 
tion as well as validation) deals with the problem of infer- 
ence; the experimenter wishes to draw an inference with regard 
to circumstances that are not precisely those in which his ex- 
periments were conducted [12, p. 27]. In order to draw a sim- 
ulation inference for the future, some assurance of validity 
would be provided by a demonstration that, for at least one 
alternative version of the simulated system and one set of 
conditions, the simulator produces results that are statistically 
not inconsistent with the known performance of the real world 
[eZee ep eee ZS 







From the above discussions it is evident that simulation is 
simply a type of experimental investigation by means of a computer
[12, p. 26]. It is that subset of Monte Carlo investigations
that consists of an elemental representation of an operation
in time [12, p. 8] or a componental representation of
operating characteristics involving time [46, p. 899] of some 
real system. Here, the reality (for policy application of the 
model validated) implies a capability of construction - not 
present existence [12, p. 8]. Schematically the essence of 
simulation may be described as: 


Simulation = Modeling of Interconnected Processes and/or 
Causal Relations + Monte Carlo Experimentation 
+ Multivariate Sequence Sampling and Inference 
+ Follow-up (Verification and Remodeling) 


Stimulus (cause) X + Response (effect) Y 
+ 


Replication under simulation-controlled conditions Z 
Thus, a simulation experiment is a kind of cause-effect- 
oriented, dynamic-static sampling and processing as it is 
viewed ex ante. The simulation sampling per se may be consid- 
ered ex post as a kind of static-dynamic empirical distribu- 
tion sampling with respect to a measure of the system perfor- 
mance; i.e., a sampling realization of the response stochastic 
process at the output. 


The Static Concept in the Statistical Design and Analysis of 
Simulation Experiments 


The statistical design of an experimental investigation is
a plan for efficient collection and analysis of data [54, p.
475]. The design and analysis are typically concerned with
static, statistical problems [4, p. 12]. Under a stationary
environment of simulation, it is possible that a statistical
comparison of two or more alternatives may be based upon measurements
obtained from a steady-state operation of each alternative
[12, p. 34]. This offers us a statistical rationale
for adaptation of the classical techniques of experimental design
and the Analysis of Variance (ANOVA) to the design and
analysis of dynamic simulation experiments (see Figure 2).




By "statics", we mean randomness and/or equilibrium; by 
"dynamics", we mean time dependence and/or disequilibrium. 
Under the domain of statics, the principle of randomization 
is important because, when randomization is used in a simu- 
lation experiment, tests of significance of treatment (policy) 
effects obtained by the ANOVA will be reliable, regardless of 
the distributions involved. As regards estimation, randomiza- 
tion also ensures that any comparison of treatments is estimated 
without bias by the same comparison of the observed responses 
B7ee pe lS. 

Schematically, the concept of randomness may be explained by: 

Randomness = Identical Distributions + Independent Sampling. 

The usefulness of randomization is characterized by repre- 
sentativeness, unbiasedness, and efficiency. 

Since physical or socio-economic systems operate under con- 
ditions of uncertainty, it may be desirable to make Monte Carlo 
experiments that will throw light upon the probability distri- 
butions of system performances that can be anticipated. The 
same strategy (a policy) may work very well in some circum- 
stances but poorly in others, so it becomes necessary to take 
into account some major conditions (singled out as blocks for 
investigation) as well as the probability of occurrence of the 
various circumstances [17, p. 644]. The various noise (minor) 
elements may not be controlled experimentally but may be con- 
trolled by the device of randomization: e.g., by the pseudo- 
random number generators without blocking [37, p. 125]. 

On the other hand, the many sources of random variations 
introduced into the experiments tend to mask the effects of 
the treatments. In order to increase the accuracy (unbiased- 
ness and precision) as well as the efficiency (reduction of the 
size of experiment), the technique of statistical, experimental 
design should be applied. In effect, the design of an experi- 
ment is a planned grouping of experimental materials by the 
principles of blocking, balancing (replication) , and confound- 
ame: (ORS pp si 4 


The great bulk of the experimental design techniques, with
or without response surface mapping, have the ANOVA as an intended
method of data analysis [43, p. 326]. It is a technique
employed to partition the total variations in the data into 
components representing the experimental error, the controlled 
variables (the conditions and treatments), and possibly their 
combined actions (e.g., the higher-order interactions); to de- 
cide which components are statistically important; and to 
estimate the effects of the different sources with a margin of 
error provided by the experiment [48, p. 2]. In other words, 
the ANOVA as well as the design of experiments is essentially
a technique of comparison of two estimates of the same variance;
i.e., an experimental investigation of the external and internal
consistency of a variance estimate [13, pp. 572-576; 55,
p. 430].
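The partition of variation described above can be sketched for the simplest case, a one-way layout in which each treatment (policy) is simulated in several independent replications. The response values and the function name below are hypothetical and chosen only so that the arithmetic is easy to follow; the F ratio is the comparison of the between-treatment and within-treatment estimates of the same variance.

```python
import numpy as np

def one_way_anova(groups):
    """Return (SS between, SS within, F ratio) for a list of replicated treatments."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    all_obs = np.concatenate(groups)
    grand_mean = all_obs.mean()
    ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    df_between = len(groups) - 1
    df_within = all_obs.size - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return float(ss_between), float(ss_within), float(f_ratio)

# Hypothetical responses: three policies, three independent replications each.
policies = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
ss_between, ss_within, f_ratio = one_way_anova(policies)
print(ss_between, ss_within, f_ratio)
```

The two sums of squares add up to the total variation about the grand mean, so the table accounts for all of the variation in the data. The sketch presumes independent replications; runs that share an identical sequence of random events would not satisfy that assumption.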

Note that a statistical significance of the F test by it- 
self is not a rataional basis for action [Sy piers 0NmseeD lutte 
May suggest a follow-up procedure, such as a multiple compari- 
son procedure [48, pp. 19-20] or a multiple range test [33, 
pp. 31-35] in order to know which treatment effects are significantly
different from the others and for which purposes a confidence
interval estimate of the difference may be made [44,
p. 1327]. In case the F test fails to reject the null hypoth- 
esis of no difference among the treatments, it may suggest a 
follow-up procedure, such as taking more observations [11, 
p. 53] or a refinement of the experimentation technique: e.g., 
an exercise of sufficient control over external influences 
[LOR a Die eS 

It has been recognized by some statisticians that ANOVA has 
certain deficiencies. These deficiencies, however, do not lie 
in the design aspects of the technique but rather in the types 
of decisions that are made on the basis of data. On the whole, 
tests of significance are less frequently useful in experimental 
work than confidence limits [10, p. 5]. In many experiments 
the hypothesis that a group of treatments all have identical 
effects seems to be obviously unrealistic [11, p. 532]. The


real problem is, indeed, to estimate some relative sizes of 


Statics and Dynamics of Simulation 187 


the differences [10, p. 5]. In simulation experiments it is 
highly likely that the experimenter is more interested in 
identifying the best alternative (a treatment) than in simply 
concluding that the alternatives (treatments) are not equivalent
Llp. ool. If the experimental objective is to find the
best alternative, the second best alternative, etc., a multiple 
ranking procedure would be adequate [49, p. 29]. 

The practice of using an identical sequence of events for 
comparative runs in order to achieve perfect homogeneity of 
the experimental medium has been criticized [11, p. 52]. Such 
a practice makes it conceptually difficult to state a formal 
mathematical model for an experimental design and introduces a 
dependence among the results of the alternative runs [12,
p. 30]. The writer would agree that the technique of ANOVA as
well as multiple rankings should not be applied if some very 
basic assumptions, e.g., randomness and normality, do not hold. 
Since a straightforward deterministic implementation of a sim- 
ulation model may not manifest a randomness of the residuals by 
the order of observations, there may be some merit in not mak- 
ing the initial conditions (e.g., Y_y) or other environmental 
variables (Z) of the runs in a set of design identically equal 
but allowing some minor random variations so that the basic 
assumption of independence may be approximately valid [53, p. 
178]. 

An example of the above-mentioned minimum randomization 
approach may be an adaption of the conventional randomized 
block design to the design of simulation experiments. Suppose 
that there are $t$ treatments ($X_k$; $k = 1, 2, \ldots, t$) and blocks of $t$
plots ($Z_{ij}$; $j = 1, 2, \ldots, t$ in the $i$th block). The $t$ treatments may
be assigned at random to the $t$ plots within each block. After
the design of the experiment, the individual runs may be carried
out deterministically. Under this design the comparative
runs, e.g., $(X_1, Z_{1j})$ and $(X_2, Z_{1j'})$, may have some randomization
feature because the jth and j'th plots of, say, the first block
are applied to the treatments, say, $X_1$ and $X_2$, by a device of
randomization [37, pp. 128-130]. 
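
The minimum randomization described above can be sketched as a random
assignment of the treatments to the plots within each block. The
function name `randomized_block_design`, the treatment labels, and the
seed are assumptions made for illustration; once the layout is fixed,
each run may still be executed deterministically.

```python
import random

def randomized_block_design(treatments, n_blocks, seed=None):
    """Return one random ordering of the treatments per block."""
    rng = random.Random(seed)  # seeded so the layout can be reproduced
    design = []
    for _ in range(n_blocks):
        order = list(treatments)
        rng.shuffle(order)     # position j in the block receives order[j]
        design.append(order)
    return design

# Hypothetical example: three treatments, two blocks.
design = randomized_block_design(["X1", "X2", "X3"], n_blocks=2, seed=1968)
for i, block in enumerate(design, start=1):
    print("block", i, block)
```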




To analyze the output of a relatively deterministic simulation,
of course, we will face the same philosophical objection
as that of a sequence of pseudo-random numbers generated by a
purely deterministic rule. This objection might at least
partially be overcome by taking a pragmatic view that a sequence
of residuals may be considered random if it satisfies a
statistical test of randomness; e.g., a run test [43, p. 46].
The writer has employed such a nonparametric test with success 
even when a static simulation model is implemented in a 
straightforward deterministic manner [40, p. 13]. 
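
One common form of run test counts the runs of residuals above and
below their median and compares the count with its expectation under
randomness. The sketch below uses that form with a normal
approximation; it is an assumption that this matches the variant the
writer employed, and the name `runs_test_z` and the residuals are
hypothetical.

```python
import math
from statistics import median

def runs_test_z(residuals):
    """Return (number of runs, z statistic) for runs about the median."""
    med = median(residuals)
    # Values equal to the median are dropped, a usual convention.
    signs = [r > med for r in residuals if r != med]
    n1 = sum(signs)
    n2 = len(signs) - n1
    if n1 == 0 or n2 == 0:
        raise ValueError("need observations on both sides of the median")
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    n = n1 + n2
    expected = 2.0 * n1 * n2 / n + 1.0
    variance = 2.0 * n1 * n2 * (2.0 * n1 * n2 - n) / (n ** 2 * (n - 1))
    return runs, (runs - expected) / math.sqrt(variance)

# Hypothetical residuals that alternate in sign (too many runs).
residuals = [0.4, -0.2, 0.1, -0.5, 0.3, -0.1, 0.6, -0.3, 0.2, -0.4]
runs, z = runs_test_z(residuals)
print(runs, z)
```

A value of |z| well above about 2 suggests that the order of the
residuals is not random; the alternating example is deliberately
non-random to show a rejection.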

Given a stationary environment of simulation, the conceptual 
difficulty of satisfying an assumption of randomness in ANOVA 
as well as in a multiple ranking procedure may be overcome by 
taking at least some minimum randomization in the design of an 
experiment and/or in the implementation of the experiment and 
using only the sample (cluster) mean(s) of truncated data as 
the observation(s) per experimental unit. Note that by cluster 
we mean a simulation output of time series that might have been 
observed and recorded at a limitedly stable state. 

The assumption of independence between sampling (cluster) 
means may be extended to an assumption of the independence be- 
tween sampling (cluster) variances. However, if one applies 
the concept of independence for design and analysis of an ex- 
periment to the consecutive individual observations of a 
cluster or clusters, a new concept of the so-called Equivalent 
Degrees of Freedom (EDF) must be introduced and employed [41, 
Chapter 3.4]. 


The Relationship Among the Statistical Notions "Between Variance,"
"Within Variance," and "Total Variance" in Regard to
Replicated Simulation Cluster Sampling

The experimental design phase of a simulation experiment
discussed above deals with the problem of "strategic" planning.
Once an experiment is defined and designed [33, p. 4], a "tactical"
planning should be made with regard to how each comparative
run (or its replicates) is to be executed [11, p. 4]:
e.g., a simulation (replicated) cluster sampling plan.







Cluster sampling is typically used in a sample survey of a 
human population. The technique of cluster sampling offers a 
convenient way for selection of a sample from a population 
whose elements are groups or clusters of some elementary units 
[iS2ZeePpeeDie]. It is recognized that, because of the clustering
effect [54, pp. 320-322] of human characteristics, there is
always some serial (door-to-door) correlation within the 
clusters even though the clusters (a primary sampling unit) 
are drawn at random [14, pp. 481-484]. 

We may adapt the notion of cluster sampling [13, pp. 192- 
201] to the interpretation, or even to the guidance, of simu- 
lation experiments. The population created by the experiment 
is the response of a certain measure of performance Y at the
output of the system, corresponding to a set of treatment com- 
binations {X} defined; a set of useful signal conditions {Z} 
designed; and a set of equivalent reasonable initializations 
[11, p. 51] including the predetermined endogenous variables,
say {Y ,}, planned at the input of the system [23, p. 1]. The
clusters are the nested consecutive observations of the response,
each at an elementary sampling unit (the period for observation)
chronologically ordered in sequence [20, p. 9]. The
size of a cluster N may be planned in advance for a given level
of statistical accuracy or probability of correct rankings 
balanced against the expenditure of computer time [24, p. 1]. 
The size of a cluster and/or the size of its replication may 
vary in a sequential sampling. 

We have noted above that a simulation model is often pre- 
sented with an environment that is itself a stationary sto- 
chastic process [43, p. 120]; i.e., events of a dynamic-causal 
system are random variables obtained from time-independent distributions
[11, p. 49]. The output of the system may also be
a stochastic process that may not be unstable [26, p. 51]. If 
there is some trend or definite pattern of periodic movements 
in the time-series output, after ruling out this signal trend 
and/or the cycles, the residual disturbances may be a stochastic 
process, stable and/or stationary [31, pp. 108-138].




One may also note that the digital computer simulation ex- 
periments can be implemented by independent replications [20, 
Diz) OO le cenn 


1. the initialization of any replicate run can be made the 
same ; 


2. the noise inputs of any run can be made purely random
or in some sophisticated way such as the antithetic variables
[55,5) ppee L417 Silemaned


3. the macroscopic factors defined and designed can be made
the same for any replicate run and accordingly be held
constant during the entire period for which the stochastic
process of the response is to be observed and also
during adjacent time intervals, which will be long
enough to allow any transient processes to die out [56,

Based on the technical capabilities of the digital computer
simulation experiments and the possible stability conditions
of some sequence series at the input and output of the simulated
dynamical system [23, pp. 1-24; 50, pp. 185=2350)) one
may adapt the classical statistical concepts of (1) the (static)
experimental sampling and (general) regression analysis [30,
pp. 50-75; 37, pp. 10-16, pp. 38-66], (2) the (static) cluster
sampling and (intraclass) correlation analysis [13, pp. 192-201],
and (3) the (dynamic) time series or random function sampling
and spectral analysis [55, ‘pp. 54-5295 56, ppl. ai=Zei] to the
peculiar demands of simulation experiments. One may thus formulate
some static and dynamic concept of, say, the simulation
(replicated) cluster sampling and multiregression analysis (see
[41]) in order to improve the understanding (and possibly to
improve the techniques) of simulation experiments.

The simulation cluster sampling proper gives the following 

Formulas: \P1Si. spp 92 915-5 ATS pp Sl aABioyle: 


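(1) $E\bar{Y} = b$

(2) $\sigma_b^2 = \operatorname{var} \bar{Y} = N^{-1}\sigma^2[1 + (N-1)\rho]$

(3) $\sigma_w^2 = N^{-1}\sigma^2(N-1)(1-\rho)$

(4) $\sigma_b^2 + \sigma_w^2 = \sigma^2$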

(5) p sploguy CI) Voea lic” = pos Gea 







where $\bar{Y}$ = the sample cluster mean, say, the ith, of N consecutive
observations after a truncation of the
transient data (the response corresponding to a
treatment combination {X} defined and a block {Z}
designed, given a set of reasonable initializations),

$\sigma_b^2$ = the external variance between clusters,

$\sigma_w^2$ = the average internal variance within clusters,

$\sigma^2$ = the variance of the population or process specified
by {X},

$\rho$ = the coefficient of intraclass correlation, and

0° = the coefficient of interclass dispersion. 
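
As a numerical check of the relations read here as (2), (3), and (4),
the sketch below computes the between-cluster and within-cluster
variances for an assumed population variance, cluster size, and
intraclass correlation, and confirms that they add up to the total.
The function name `cluster_variances` and the numbers are hypothetical.

```python
def cluster_variances(sigma2, n, rho):
    """Between- and within-cluster variance for cluster size n."""
    between = sigma2 * (1.0 + (n - 1) * rho) / n   # relation (2)
    within = sigma2 * (n - 1) * (1.0 - rho) / n    # relation (3)
    return between, within

# Assumed values: total variance 4, clusters of 10, rho = 0.3.
between, within = cluster_variances(sigma2=4.0, n=10, rho=0.3)
print(between, within, between + within)           # the sum is relation (4)
```

With rho = 0 the between-cluster variance reduces to sigma2 / n, the
familiar variance of the mean of n independent observations.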


Since a replicated sample cluster of a size N under the 
stationary environment of simulation experiment is viewed as 
a replicated (across-the-ensemble) N-dimensional random vari- 
able of a stationary time series, we may adapt the classical 
(static) concept of intraclass correlation to a dynamic concept 
of autocorrelation and call the former the autocorrelation-in- 
average, given the same N-dimensional sampling [55, p. 522; 
MRS pie DON: 


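(6) $\sigma_b^2 = N^{-2}\sigma^2\left[N + 2\sum_{L=1}^{N-1}(N-L)\rho_L\right]$

(7) $\rho = [N(N-1)]^{-1}\left[2\sum_{L=1}^{N-1}(N-L)\rho_L\right]$,

where $\rho_L$ = the coefficient of autocorrelation for lag L [52,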
pie ZA. 
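
The autocorrelation-in-average can be illustrated numerically. The
sketch below takes a hypothetical set of lag autocorrelations (here
those of a first-order scheme with coefficient 0.5, an assumption made
only for the example), forms their weighted average as in (7), and
checks that the variance of the cluster mean is the same whether it is
computed from (2) or from (6).

```python
def autocorrelation_in_average(rho_by_lag):
    """Weighted average of lag autocorrelations; rho_by_lag[L-1] is rho_L."""
    n = len(rho_by_lag) + 1
    weighted = sum((n - lag) * r for lag, r in enumerate(rho_by_lag, start=1))
    return 2.0 * weighted / (n * (n - 1))

a, n, sigma2 = 0.5, 5, 1.0                      # assumed illustrative values
rho_by_lag = [a ** lag for lag in range(1, n)]  # rho_L = a**L for L = 1..N-1
rho_bar = autocorrelation_in_average(rho_by_lag)

# Variance of the cluster mean from the intraclass form (2) ...
var_mean_via_2 = sigma2 * (1.0 + (n - 1) * rho_bar) / n
# ... and from the lag-by-lag form (6).
lag_sum = sum((n - lag) * r for lag, r in enumerate(rho_by_lag, start=1))
var_mean_via_6 = sigma2 * (n + 2.0 * lag_sum) / n ** 2
print(rho_bar, var_mean_via_2, var_mean_via_6)
```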

The quantity $(N-1)\rho$ in (2) measures the excess variations
of a cluster sampling (the clustering effect of similarities
within a cluster) over random sampling [13, p. 199]. Because
of the slowness of stochastic convergency of a time series and,
in particular, the technical difficulty of choosing an appropriate
number of lags c in order to balance the conflicting
requirements of resolution and statistical stability, given an
(untruncated) size N* of a single sample time series in spectral
analysis [20, p. 31; 45, p. 1354], one may suggest an application
of the method of replication for gaining efficiency
[ZO peor 




However, it must be recognized that a pure replication with- 
out nested repetitions (a multiple of equivalent independent 
observations) may lead to excessive computer time and fail to 
yield the type of information that is desired about a particular
time series: namely, a dynamic measure [44, p. 1328; 20,
p. 70]. The question is, then, whether the losses in efficiency
arising from the intraclass correlation (or autocorrelation-in- 
average) within a cluster are offset by the decreases in the 
cost of collecting the information, as opposed to the higher 
cost of collection of the information from relatively random 
observations through replications (the replicated sample means 
may be more widely dispersed). It will always pay to use clus- 
ters of convenient and economic size N, taking enough of them 
(say, of a size m) to attain the accuracy desired [13, p. 190]. 

One may also note from (4) that the population variance $\sigma^2$
of a stationary time series is a constant (at least asymptotically)
that might be estimated by summing up its two component
variances estimated, as $\hat{\sigma}^2 = s_b^2 + s_w^2$. The procedure of
estimation is independent of the cluster size N chosen and the
intraclass correlation (or autocorrelation-in-average) $\rho$ in
existence. The condition of independence just mentioned may
benefit the experimenter by permitting some degree of flexibility
in planning the total size of cluster sampling: namely
mN. In other words, the smaller the cluster size N planned,
the smaller the average internal variance within clusters $s_w^2$
would be; at the same time, the larger the external variance
between these clusters $s_b^2$, if replicated. Consequently, a
larger size of replication m might be required for increasing
the precision (or accuracy) desired. Moreover, (4) suggests
that one can construct a confidence interval on the sampled
mean of cluster means with an estimated variance by a component
$s_b^2$ for an estimate of the mean response or by a total
($s_b^2 + s_w^2$) for an estimate of individual responses [16, pp.
194-195]. An extension of this approach is a new multiregression
method that can benefit the experimenter by permitting
seven levels of estimation [41].
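
The estimation from replicated clusters can be sketched as follows.
The function name `cluster_estimates`, the data, and the choice of
divisors (the usual sample variance for the cluster means, and divisor
N for the internal variance so as to mirror (3)) are assumptions made
for illustration, not necessarily the estimators used in [41].

```python
import math
from statistics import mean, variance

def cluster_estimates(clusters, truncate=0):
    """Summarize m replicated clusters after dropping transient data."""
    kept = [c[truncate:] for c in clusters]
    means = [mean(c) for c in kept]
    grand_mean = mean(means)
    s_between = variance(means)  # sample variance of the m cluster means
    # Average internal variance within clusters (divisor N, an assumption).
    s_within = mean([sum((y - mean(c)) ** 2 for y in c) / len(c) for c in kept])
    std_error = math.sqrt(s_between / len(kept))  # for the mean of cluster means
    return grand_mean, s_between, s_within, std_error

# Hypothetical output of m = 3 replicated runs; first observation truncated.
clusters = [
    [0.0, 2.0, 4.0, 6.0],
    [1.0, 3.0, 5.0, 7.0],
    [2.0, 4.0, 6.0, 8.0],
]
grand_mean, s_between, s_within, std_error = cluster_estimates(clusters, truncate=1)
total_variance = s_between + s_within  # estimate of the total variance, as in (4)
print(grand_mean, s_between, s_within, std_error, total_variance)
```

An interval for the mean response would use `std_error`, while an
interval for individual responses would use the square root of
`total_variance`.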




The Dynamic Concept in Fitting a Stochastic Difference 


Equation 


So far we have discussed a concept of statistical statics. 
Theoretically speaking, the statics is a limiting case of 
dynamics. For instance, the variance of sampling cluster mean
$\sigma_b^2$ as well as the average internal variance within clusters $\sigma_w^2$
depends (at least asymptotically) upon the coefficient of autocorrelation
$\rho_L$. The most important and significant correlation
that must be considered in the assessment of precision as well
as an understanding of the dynamic nature of the system simulated
is the autocorrelation between adjacent observations $\rho_1$
[pane Dey lies 

Let $Y_t$ be an ordered observation at the output of the simulated
system corresponding to {X},{Z} and certain initializations
in a sample time series of a cluster, replicated or not,
$t = 1, 2, \ldots$. Given a sample cluster, the data may be fitted
to a stochastic difference equation of an order k. The simplest
(but the most important and commonly used) model is a
first-order autoregressive scheme, say, of a lagged variable [52,

PPE ZOD 200)": 


(8) $Y_t = b^* + aY_{t-1} + e_t$

Assume:

(9) $Ee_t = 0$, $Ee_t^2 = \sigma_e^2$, $Ee_te_{t'} = 0$ ($t \neq t'$), $|a| < 1$,

where $b^*$ = the intercept, and
$a$ = the rate of association (equivalent to the autocorrelation
$\rho_1$).
If we impose an initial condition $Y_0 = 0$ (e.g., starting simulation
at an empty and idle state), the general solution is
given by [41]


(10) $Y_t = b^*(1-a^t)(1-a)^{-1} + \sum_{L=1}^{t} a^{t-L}e_L$

(11) $\operatorname{var} Y_t \to \sigma_e^2(1-a^2)^{-1} < \infty$ when $t \to \infty$

(12) $EY_t = b^*(1-a^t)(1-a)^{-1}$

(13) $EY_\infty = \lim_{t\to\infty} EY_t = b^*(1-a)^{-1}$

Let $EY_\infty = b$; then

(14) $b = b^*(1-a)^{-1}$.


(Note: $b^*$ and $a$ are negatively correlated.)
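
The approach of the expected response to its equilibrium can be traced
numerically. The sketch below iterates the expectation of a
first-order scheme from a zero initial condition and compares it with
the closed forms read here as (12) and (14). The function name
`expected_path` and the values of the intercept and the rate of
association are hypothetical.

```python
def expected_path(b_star, a, horizon):
    """Iterate E[Y_t] = b_star + a * E[Y_(t-1)] starting from E[Y_0] = 0."""
    path = [0.0]
    for _ in range(horizon):
        path.append(b_star + a * path[-1])
    return path

b_star, a = 2.0, 0.5                      # assumed illustrative values
path = expected_path(b_star, a, horizon=20)
# Closed form of the expectation, as in (12).
closed_form = [b_star * (1 - a ** t) / (1 - a) for t in range(21)]
# Equilibrium position, as in (14).
equilibrium = b_star / (1 - a)
print(path[:4], path[-1], equilibrium)
```

Even after 20 periods the expected response remains slightly below the
equilibrium, which is the sense in which the equilibrium is approached
but never attained.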


Under a condition |a| < 1, the system (or more precisely, 
the process of one aspect of the system performance of a simu- 
lation model specified) would be stable; i.e., the long-run 
mean of the process might be stationary, hence, an equilibrium 
position [12, p. 35]. Note that the equilibrium is a dynamic 
equilibrium, not in the sense of static equilibrium which is a 
solution of a static model [1, pp. 326-330]. The concept of 
dynamic equilibrium does not deny the existence of runs and 
cycles, nor does it require that the sample cluster be normally 
distributed [12, p. 35]. Moreover, referring to (12), (13),
and (14), one can see that the equilibrium position b can only
be approached but actually never attained [11, p. 49]. Nonetheless,
it is a static (asymptotic) measure of the system performance
such that an estimate of it can be calculated by the
dynamic pattern of the process sampled, namely, $b = b^*(1-a)^{-1}$,
and/or directly measured by defining a period of observations,
namely, the sample cluster mean of truncated data [41, Chapter
3.5]. Ideally, the period for comparison should not be biased
by either the length of the period or the point at which it
begins” [25 per sill. 

Note that (10) is derived from an initial condition $Y_0 = 0$.
The stochastic, dynamic output at the transitional beginning
may have a "warming-up" pattern if the initializations are
rather low, short of specifying some ideal, reasonable starting
conditions [11, p. 50], corresponding to a set of treatment
combinations {X} and useful signal circumstances {Z}. The time
required for the process of stabilization (depending on the
magnitude of rate of association a, other things being equal,




see (62) of [41]) can be materially reduced by a judicious 
choice of some preloading conditions [12, p. 34]. Since the 
experimenter may never have a priori knowledge of the equili- 
brium conditions, he may make some errors in specifying the 
preloading conditions such that the initializations may be 
rather high. Then, the transitional output (a situation of 
disequilibrium due to the arbitrary preloading) may be characterized
by a settling-down pattern, in case 0 < a < 1. The
settling-down transient processes are discussed in Chapter 3.2
of [41].

Again, note that the sign and magnitude of the rate of as- 
sociation a in (8) would certainly vary, depending upon the 
dynamic nature of the model (e.g., an exponential smoothing 
constant [20, pp. 406-411]), the conditions of operation (e.g.,
the initializations, treatment combination, and environments
specified), and the length of the measurement period (e.g., the
simulated time unit of a week, a month, or a year). The dynamic
characteristic a will be estimated by curve-fitting of
all the observed data (including the transitional beginning). 
The data lost by truncation will be used for a study of dy- 
namics, and the dynamic information can be used to check the 


accuracy of static information: e.g.,

$s_b^2 + s_w^2 \approx s_e^2(1-\hat{a}^2)^{-1}$ from (4) and (11),

where $s_e^2$ = the pooled estimate of $\sigma_e^2$ and

$\hat{a} = m^{-1}\sum_{i=1}^{m} \hat{a}_i$.


Moreover, it may be possible to arrange all the replications of 
a replicated cluster sampling plan as a chain job for execution 
in a single run so that the time wasted in taking the model on 
and off the shelf can be eliminated [35, pp. 66-67]. 
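
The curve-fitting of the rate of association mentioned above can be
sketched with ordinary least squares of each observation on its
predecessor. This is a minimal illustration under the assumption of a
first-order scheme; the name `fit_first_order` and the noise-free
series are hypothetical, and with random disturbances the estimates
would only approximate the true values.

```python
def fit_first_order(series):
    """Least-squares fit of Y_t = b_star + a * Y_(t-1); returns (b_star, a)."""
    x = series[:-1]                      # lagged values Y_(t-1)
    y = series[1:]                       # current values Y_t
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a_hat = sxy / sxx
    b_star_hat = mean_y - a_hat * mean_x
    return b_star_hat, a_hat

# Noise-free series from Y_0 = 0, including the transitional beginning.
series = [0.0]
for _ in range(30):
    series.append(2.0 + 0.6 * series[-1])

b_star_hat, a_hat = fit_first_order(series)
equilibrium_hat = b_star_hat / (1.0 - a_hat)   # dynamic estimate of b, as in (14)
print(b_star_hat, a_hat, equilibrium_hat)
```

The transitional observations are kept here on purpose: they carry most
of the information about the rate of association, and the resulting
equilibrium estimate can be compared with the truncated cluster mean.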


A Total Concept of Statics and Dynamics Through Integration 
of Statistical Methods 


In the previous sections we discussed the concepts of statics 
and dynamics in a natural order: namely, model validation and 




simulation experiments (the techniques of distribution sampling,
experimental design, cluster sampling, and autoregressive
scheme fitting). In this section we will summarize these con- 
cepts and stress a total concept approach under a broader notion 
of simulation cluster sampling. 

From a static point of view, any simulation experiment (sto- 
chastic or degenerate) might be regarded simply as taking a 
sample of clusters (groups of observations) from a population 
(created by the experiment itself) for evaluation or compari- 
son of certain aspects of system performance (a response) under 
different system disciplines [32, pp. 48-50], whereas, from a 
dynamic point of view, a sample cluster generated by a simulation
experiment is actually a dynamic representation of the
response (a process) achieved by moving the simulated model 
through time [47, p. 1]. Then, from a static-dynamic, syn- 
thetic point of view, we may say that an N-dimensional cluster 
sampling is an N-dimensional sampling realization of a digital- 
computer-created stochastic process (a time-series response) 
under a set of macroscopic conditions specified by the experimental
design technique for making a comparison between alternative
policy applications.

The total concept of statics and dynamics provides a con- 
ceptual background for adaption and integration of the various 
statistical techniques, such as experimental design, cluster 
sampling, regression analysis, and time-series analysis for a
higher efficiency of their use (namely, more information, 
higher accuracy, at lower cost) (see Figure 2). The notion of 
simulation cluster sampling proper in its static sense features 
a replicated sampling of cluster means and variances from some 
critical, comparative populations distinguished by the treat- 
ment combinations applied (in case a response surface explora- 
tion is the objective of the experiment). The simulated em- 
pirical observations, which are required to be (in some sense) 
in a limitedly stable state, are then mapped (through a topological
factor-level selection approach [3, p. 113]) with a new
multiregression method devised for a confidence interval esti- 







mate of the responses with a variance by total or by its com- 
ponents. 

This notion, in a dynamic sense, also facilitates an evalua- 
tion of the underlying stochastic process by fitting some signif- 
icant lower-order stochastic linear difference equations 
(through a topological discrete time approach [31, p. 1]) to re- 
veal the contrasting dynamic behavior patterns. The dynamic con- 
cept of the cluster sampling notion can directly be extended to 
include a case where there is an obvious trend and/or cyclic 


movements (see Figure 2). 


[Figure 2: diagram centered on STATISTICAL THEORY OF SIMULATION
CLUSTER SAMPLING (STATICS and DYNAMICS), linked to EXPERIMENTAL
DESIGN, CLUSTER SAMPLING PROCEDURE, MULTI-REGRESSION ANALYSIS AND
STATISTICAL INFERENCE, and STOCHASTIC PROCESS ANALYSIS.]

Figure 2. A total concept of statics and
dynamics through the integration
of statistical methods




The multiregression method is a method of two-stage applica- 
tion of the conventional multiple regression: namely, the re- 
gression of the sampling cluster mean as well as the sampling 
cluster variance in terms of the factors {X} defined by the 
experiment. Accordingly, there are two regression functions 
fitted. By application of (4), one can construct a response 
surface of the average responses $\bar{Y}$ explained by the factors {X}
along some estimated confidence band [15, p. 169], which can be 
explained also by the independent variables {X}. The shortest 
confidence band is simply the standard error of estimate with 
respect to the sample cluster means about the regression func- 
tion designed. The longest confidence band (the seventh level 
of estimation) is the square root of the sum of the component 
variances estimated and fitted [41]. 

The writer has successfully applied the total concept of 
statics and dynamics to the design and analysis of simulation 
experiments with a model of a management information system of 
a hypothetical manufacturing firm [7, pp. 1-27]. The model was 
written in the General Purpose Systems Simulation programming 
language, which is based on a particle-flow, next-event logic 
[19, p. 8]. Some random sources were simulated in a three-stage 
physical system: raw material procurement, fabrication, and 
assembly and shipping. This multistage feature together with 
an exponential smoothing forecasting algorithm introduced much 
of the dynamics of the model. The model included an accounting 
framework that generated a weekly statement of profit and loss 
(a permanent entity Y in a static sense) [6, p. 78]. 

The experimental design adapted was a central composite 
rotatable second-order design in balanced incomplete blocks 
[10, pp. 342-349]. The two blocks {Z} were the two demand 
functions: step and sinusoid. The two factors {X} defined 
were the planning cycle P and the average information delay 
D, with five levels of each being used. A full factorial ex- 
ploration would involve (5 x 5 x 2) = 50 simulation runs 
(cluster observations). The design shown required only 12 


runs, including 2 replicates for each block at the center for 







assessment of the experimental error. Some equivalent, rea- 
sonable initializations also were planned for each independent 
simulation run such that the stocks of preloading were pro- 
vided in proportions to the initial demand mix and at the 
same time were adjusted between different runs so as to be 
compatible with the planning cycle P designed 75> Me SOJ)< 
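
The run count of such a design can be illustrated with a sketch of a
two-factor central composite rotatable design in coded units. How the
twelve runs were actually allotted to the two demand-function blocks is
not stated here, so the split below (factorial points in one block,
axial points in the other, two center points in each) is an assumption,
as are the function name `central_composite_two_blocks` and the block
labels.

```python
import math

def central_composite_two_blocks(center_reps=2):
    """Coded design points (x1, x2) of a rotatable two-factor design."""
    alpha = math.sqrt(2.0)  # rotatable axial distance for two factors
    factorial = [(x1, x2) for x1 in (-1.0, 1.0) for x2 in (-1.0, 1.0)]
    axial = [(-alpha, 0.0), (alpha, 0.0), (0.0, -alpha), (0.0, alpha)]
    center = [(0.0, 0.0)] * center_reps
    return {"block_1": factorial + center, "block_2": axial + center}

design = central_composite_two_blocks()
total_runs = sum(len(points) for points in design.values())
# Distinct coded levels taken by the first factor.
levels = sorted({round(p[0], 6) for points in design.values() for p in points})
print(total_runs, levels)
```

The sketch yields twelve runs and five coded levels per factor, against
the fifty runs of the full factorial exploration.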

The size of cluster observations without truncation N* was
iése and the size of a cluster after truncation (N*-c) = N was
about 13 to 15. For each untruncated cluster a first-order
autoregressive scheme was fitted, and a regression function
of the rate of association a (a dynamic comparison) also was
estimated in terms of the factors $X_1 = P$ and $X_2 = D$. Two response
surfaces (a static comparison) were fitted to the
cluster average of profits $\bar{Y}$ (the result is shown in Figure
3) and the cluster variance of profits $s_w^2$, respectively. The
goodness of fit of both was found to be statistically highly
significant. The surface shown (a hyperbolic paraboloid) reveals
some unsuspected linkages between planning cycles and
delays and stimulates general insights into model sensitivity 
to the factors shown [6, p. 78]. 





[Figure 3: response surface plotted against PLANNING CYCLE
(ONE WEEK, TWO WEEK, THREE WEEK, ONE MONTH).]

Figure 3. A profit response surface


Conclusion 


This paper has explored in some degree the statistical 
concept of statics and dynamics, and an integrated statistical 
methods approach has been suggested. The paper has discussed 
briefly what might be called "simulation statics" in terms of 
simulation experiments X, Y, and Z. The term simulation statics 
refers to the fact that many experiments are implemented in a 
stationary stochastic process environment. In other words, the 
external characteristics of the phenomenon giving rise to the 
stochastic process are experimentally invariant in time. In 
some cases, however, dynamic experiments are implemented. In 
other words, the macroscopic input may vary as a time function, 
and the experimenter is interested in the dynamic response at 
the output of the system, also as a time function. This type 
of experiment requires new analytical concepts and techniques 
which the writer has called simulation dynamics. 

This paper has been limited to the discussion of only a 
single response equation and has not attempted a full exposi- 
tion of the concept of simulation dynamics. In addition, space 
has permitted only a limited treatment of the important concept 
of simulation inference. The writer plans a fuller treatment 


in a subsequent publication. 


Bibliography 


1. Ackley, G. Macroeconomic Theory. New York: Macmillan Co.,
1961.

2. Alexandroff, P. Elementary Concept of Topology. New York:
Dover Publications, Inc., 1961.

3. Ashby, W.R. An Introduction to Cybernetics. New York:
John Wiley & Sons, 1966.

4. Bartlett, M.S. Stochastic Processes, Methods and Application.
New York: Cambridge University Press, 1966.

5. Blackman, R.B., and Tukey, J.W. The Measurement of Power
Spectra. New York: Dover Publications, Inc., moor

6. Boyd, D.F. "The Emerging Role of Enterprise Simulation
Models," IBM Advanced Systems Development Division, 1965.


7. Boyd, D.F., and Krasnow, H.S. "A Methodology for the Economic
Evaluation of Management Information Systems," IBM
Mechinwcaly se pion te ely/ a0 Sr io Sy

8. Chernoff, H., and Moses, L.E. Elementary Decision Theory.
New York: John Wiley & Sons, 1967.

9. Clarkson, G.P.E., and Simon, H.A. "Simulation of Individual
and Group Behavior," American Economic Review, L
(1960).

10. Cochran, W.G., and Cox, G.M. Experimental Designs. 2nd
edition. New York: John Wiley & Sons, 1957.

11. Conway, R.W. "Some Tactical Problems in Digital Simulation,"
Management Science, X (October, 1963).

12. Conway, R.W., Johnson, B.M., and Maxwell, W.L. "Some Problems
of Digital Systems Simulation," Department of Industrial
and Engineering Administration, Cornell University,
Sin, IA,

13. Deming, W.E. Some Theory of Sampling. New York: John
Wiley & Sons, 1950.

14. Deming, W.E. Sample Design in Business Research. New York:
John Wiley & Sons, 1960.

15. Deming, W.E. Statistical Adjustment of Data. New York:
Dover Publications, Inc., 1964.

16. Dixon, W.J., and Massey, F.J. Introduction to Statistical
Analysis. New York: McGraw-Hill Book Co., 1957.

17. Duesenberry, James, et al. The Brookings Quarterly Econometric
Model of the U.S. Chicago: Rand McNally & Co.,
1965.

18. Dym, Charles H. "Preliminary Description of National
Economic Model," IBM Advanced Systems Development Division,
1965.

19. Evans, G.W., Wallace, G.F., and Sutherland, G.L. Simulation
Using Digital Computers. Englewood Cliffs, N.J.:
Prentice-Hall, Inc., 1967.

20. Fishman, G.S., and Kiviat, P.J. "Spectral Analysis of
Time Series Generated by Simulation Models," The RAND
Corporation, Memo RM-4393-PR, February, 1965.

21. Fishman, G.S. "Problems in the Statistical Analysis of
Simulation Experiments: The Comparison of Means and the
Length of Sample Records," The RAND Corporation, Memo
RM-4880-PR, February, 1966.

22. Fishman, G.S. "Statistical Considerations in Computer
Simulation Experiments," The RAND Corporation, AD653174,
May, 1967.

23. Fishman, G.S. "Digital Computer Simulation: Input-Output
Analysis," The RAND Corporation, Memo RM-5540-PR,
February, 1968.


202 Timothy Y. Ling 





24. Fishman, G.S. "Digital Computer Simulation: The Allocation 
of Computer Time in Comparing Simulation Experiments," 
The RAND Corporation, Memo RM-5288-1-PR, October, 1967. 


25. Flagle, C.D., Huggins, W.H., and Roy, R.H. Operation 


Research and Systems Engineering. Baltimore, Md.: Johns 
Hopkins Press, 1960. 


26. Forrester, J.W. Industrial Dynamics. Cambridge: M.1.T. 
Press, 1961. 


27. Gamow, G. One, Two, Three - Infinity. New York: The Vik- 
ing Press, 1961. 


28. Goldberger, A.S. Econometric Theory. New York: John Wiley 
& Sons, 1964. 


29. Grenander, Ulf, and Rosenblatt, M. Statistical Analysis 
of Stationary Time Series. New York: John Wiley §& Sons, 
NOS 


30. Hammersley, J.M., and Handscomb, D.C. Monte Carlo Methods. 
New York: John Wiley §& Sons, 1964. 


31. Hannan, E.J. Time Series Analysis. New York: John Wiley 
& Sons, 1960. 


32. Hansen, M.H., Hurwitz, W.N., and Madow, W.G. Sample 
Survey Method and Theory. Vol. I: Methods and Applica- 
tions. New York: John Wiley § Sons, 1962. 

33. Hicks, C.R. Fundamental Concepts in the Design of Experi- 
ments. New York: Holt Rinehart § Winston, 1964. 

34. Hoel, P.G. Introduction to Mathematical Statistics. New 
York: John Mwamley SGmSons lvoe 


35. IBM Reference Manual 709/7090 FORTRAN Programming System, 
C28-6054-2, 1961. 


36. Johnston, J. Econometric Methods. New York: McGraw-Hill 
BOOK GonLolOor 


37. Kempthorne, Oscar. The Design and Analysis of Experiments. 
New York: John Wiley §& Sons, 1962. 

38. Kiviat, P.J. Digital Computer Simulation: Modeling Con- 
cepts. Santa Monica, Calif.: The RAND Corporation (AD658429), 
August, 1967. 

39. Krasnow, H.S. "Highlights of a Dynamic System Description 
Language," IBM Advanced Systems Development Division, 
Techniacal® Report. Wools Maly. ntS) OlOre 


40. Ling, T.Y. "Statistical Design and Analysis of Determin- 
istic Simulation Experiments with a Static, Socioeconomic 
Model of the U.S.," IBM, Technical Report (forthcoming). 


41. Ling, T.Y. "A Statistical Concept of Statics and Dynamics," 
IBM Advanced Systems Development Division, Technical 
Report 17-235, November, 1968. 


42. Mechanic, H., and McKay, W. "Confidence Intervals for 
Averages of Dependent Data in Simulations," IBM Advanced 
Systems Development Division, Technical Memo 17-7008, 
1964. 

43. Naylor, Thomas H., Balintfy, Joseph L., Burdick, Donald S., 
and Chu, Kong. Computer Simulation Techniques. New York: 
John Wiley & Sons, 1966. 

44. Naylor, Thomas H., Burdick, Donald S., and Sasser, W. Earl. 
"Computer Simulation Experiments with Economic Systems: 
The Problems of Experimental Design," The American Statistical 
Association Journal, (December, 1967), 1315-1337. 

45. Naylor, Thomas H., Wallace, Wm. H., and Sasser, W. Earl. 
"A Computer Simulation Model of the Textile Industry," 
The American Statistical Association Journal, LXII (320) 
(December, 1967), 1338-1364. 

46. Orcutt, G.H. "Simulation of Economic Systems," The American 
Economic Review, L (5) (December, 1960). 

47. Orcutt, G.H. "Simulation of Economic Systems: Model De- 
scription and Solution," University of Wisconsin, Draft, 
(1964). 

48. Peng, K.C. The Design and Analysis of Scientific Experi- 
ments. Reading, Mass.: Addison-Wesley Publishing Co., 
1967. 

49. Sasser, W. Earl, and Naylor, Thomas H. "Computer Simula- 
tion of Economic Systems - An Example Model," Simulation, 
(January, 1967), 21-32. 

50. Sveshnikov, A.A. Problems in Probability Theory, Mathe- 
matical Statistics and Theory of Random Functions. Trans- 
lated by Scripta Technica, Inc. Philadelphia: W.B. 
Saunders Co., 1968. 

51. Teichroew, D. "A History of Distribution Sampling Prior 
to the Era of the Computer and Its Relevance to Simulation," 
American Statistical Association Journal, (March, 1965). 

52. Tintner, G. Econometrics. New York: John Wiley & Sons, 
1952. 

53. Tocher, K.D. The Art of Simulation. Princeton, N.J.: D. 
Van Nostrand Co., 1963. 

54. Wallis, W.A., and Roberts, H.V. Statistics, a New Approach. 
New York: The Free Press, 1959. 

55. Wilks, S.S. Mathematical Statistics. New York: John Wiley 
& Sons, 1962. 

56. Yaglom, A.M. An Introduction to the Theory of Stationary 
Random Functions. Englewood Cliffs, N.J.: Prentice-Hall, 
Inc., 1962. 





Methodological Problems 







Simulation Versus 
Analytical Solutions 


Philip Howrey and H. H. Kelejian, 
Princeton University 


Introduction 


In recent years a number of large-scale, dynamic econometric 
models have been constructed [6,8,10]. These models frequently 
contain structural equations which are nonlinear in the endo- 
genous variables so that explicit analytical solutions for the 
reduced-form equations are difficult, if not impossible, to 
obtain. In addition, the parameters of the structural equa- 
tions are often estimated by single-equation methods such as 
two-stage least squares. For these reasons economists have 
come to rely on simulation experiments to investigate the dy- 
namic behavior of econometric models and to assess the validity 
of the system of equations. (A simulation experiment is the 
solution sequence generated by a dynamic model under certain 
specified conditions in which the exogenous variables are us- 
ually taken as given and the model is used to generate the en- 
dogenous variables sequentially.) For the same reasons the 
dynamic multipliers which relate the endogenous variables of 
the system to the predetermined variables are also frequently 
determined by simulation experiments. 

The results of this paper suggest that the role of simula- 
tion as a tool of analysis of econometric models should be re- 
considered. We show that once a linear econometric model has 
been estimated and tested in terms of the known distribution 
theory concerning parameter estimates, simulation experiments 
that are undertaken to investigate the model as an interrelated 
system yield no additional information about the validity of 
the model. Although some of the dynamic properties of linear 
models can be inferred from simulation results, an analytical 
technique based on the model itself is available for this pur- 
pose. We also show that the application of nonstochastic sim- 
ulation procedures to econometric models that contain nonlin- 
earities in the endogenous variables yields results that are 
not consistent with the properties of the reduced form of the 
model. An alternative procedure is therefore suggested. Fin- 
ally, the results derived from stochastic simulation of non- 
linear systems are shown to be consistent with the correspond- 
ing reduced-form equations. 


Simulation of Linear Models 


Consider the dynamic structural model 

(1)  $y_t \Gamma = x_t B_1 + y_{t-1} B_2 + u_t, \qquad t = 1, 2, \ldots,$

where $y_t$ is a 1 x K vector of observations at time t on the 
endogenous variables; $\Gamma$ is a K x K matrix of parameters; $x_t$ 
is a 1 x G vector of observations at time t on the exogenous 
variables; $B_1$ is a G x K matrix of parameters; $y_{t-1}$ is the 
vector of lagged values of $y_t$; $B_2$ is a K x K matrix of 
parameters; $u_t$ is a 1 x K vector of disturbance terms at time t. 
The results given subsequently do not depend upon the simple 
lag structure of model (1) since a higher-order system can be 
reduced to a first-order system with the introduction of 
appropriately defined artificial variables - see Baumol [2]. We 
assume that the disturbances and the exogenous variables have 
been generated by a stationary stochastic process such that for 
all t and s, $E[u_t \mid x_s] = 0$ and $E[u_t' u_t \mid x_s] = V$, where V is a 
K x K matrix of parameters which are independent of the elements 
of $x_s$, and that $E[u_t' u_s] = 0$ for $t \neq s$. Finally, we assume that 
the probability limits of the sample moments based on the 
elements of $u_t$ and $x_t$ are equal to their corresponding 
expectations.$^{1}$ 







Non-Stochastic Simulation 


Assuming that $\Gamma^{-1}$ exists, the reduced-form system 
corresponding to (1) is 

(2)  $y_t = x_t \Pi_1 + y_{t-1} \Pi_2 + v_t,$

where $\Pi_i = B_i \Gamma^{-1}$ (i = 1, 2) and $v_t = u_t \Gamma^{-1}$. The assumptions 
described above imply that the parameters of (2) can be 
consistently estimated - see Goldberger [12, Chapter 7]. Let $\hat{\Pi}_i$ 
be a consistent estimate of $\Pi_i$ (i = 1, 2) derived from a sample 
of size N so that $\hat{\Pi}_i = \Pi_i + \Delta_i$, where the matrix $\Delta_i$ converges 
in probability to the null matrix as the sample size increases 
without limit, i.e., plim $\Delta_i = 0$. Using these definitions, 
the vectors of non-stochastically simulated values of the 
endogenous variables are defined as 

(3)  $y_t^* = x_t \hat{\Pi}_1 + y_{t-1}^* \hat{\Pi}_2, \qquad t = 1, 2, \ldots; \qquad y_0^* = y_0.$


That is, the simulated values of the endogenous variables are 
generated sequentially from the estimated reduced-form equa- 
tions with the exogenous variables set equal to their histor- 
ical values. 

We turn now to an investigation of the relationship between 
the historical values of the endogenous variables, $y_t$, and 
their simulated counterparts, $y_t^*$. Subtracting (2) from (3) 
yields the difference equation 

(4)  $(y_t^* - y_t) = (y_{t-1}^* - y_{t-1}) \Pi_2 + x_t \Delta_1 + y_{t-1}^* \Delta_2 - v_t,$

which has as its solution 

(5)  $y_t^* - y_t = \sum_{j=1}^{t} (x_j \Delta_1 + y_{j-1}^* \Delta_2 - v_j) \Pi_2^{t-j}.$

From this relationship between the simulated and historical 
values of the endogenous variables it follows that given t, 
$y_0$ and $X = (x_1, \ldots, x_t)$, 

(6)  $\operatorname{plim}\, y_t^* = \bar{y}_t,$

where $\bar{y}_0 = y_0$ and $\bar{y}_t = x_t \Pi_1 + \bar{y}_{t-1} \Pi_2$ for t = 1, 2, .... It is 
clear from (6) that 

(7)  $E[y_t \mid X, y_0] = \operatorname{plim}\, [y_t^* \mid X, y_0].$

The implication of (7) is that if equation (1) is correctly 
specified, the K scatter diagrams between the elements of $y_t$ 
and the corresponding elements of $y_t^*$ should, for large samples, 
outline 45 degree lines. Therefore the inherent dynamic 
properties of the process generating the elements of $y_t$ can be 
inferred from an examination of the time paths of the elements 
of $y_t^*$. It would also appear that the results given in (7) 
support the presumption of many authors that a rigorous test or 
validation of an econometric model can be carried out in terms 
of comparisons between the historical and simulated values of 
the endogenous variables - see, e.g., Goldberger [11, pp. 49- 
51], Holt [15], and Fromm and Taubman [10, Chapter 2]. This, 
however, is not the case. For even if $\Delta_1$ and $\Delta_2$ are ignored, it 
is clear from (5) that the difference between a particular 
element of $y_t$ and the corresponding element of $y_t^*$ is a 
disturbance term which is both autocorrelated and heteroskedastic. 
Therefore the relationship between such elements should not be 
studied in terms of simple correlation analysis. 
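
The autocorrelation and heteroskedasticity of the gap between simulated and historical values can be illustrated with a scalar version of (2) and (3). The following is only a sketch under assumed values: the coefficients pi1 and pi2, the unit disturbance variance, the uniform exogenous series, and all function names are hypothetical, and the estimation errors are set to zero as in the large-sample argument.

```python
import random

def simulate_paths(pi1, pi2, x, y0, sigma, rng):
    """Return historical path y, non-stochastic simulated path y_star, and shocks v.

    Scalar analogue of (2) and (3) with the estimation errors ignored:
        y[t]      = pi1 * x[t] + pi2 * y[t-1] + v[t]
        y_star[t] = pi1 * x[t] + pi2 * y_star[t-1]
    """
    y, y_star, v = [y0], [y0], [0.0]
    for t in range(1, len(x)):
        shock = rng.gauss(0.0, sigma)
        v.append(shock)
        y.append(pi1 * x[t] + pi2 * y[-1] + shock)
        y_star.append(pi1 * x[t] + pi2 * y_star[-1])
    return y, y_star, v

def gap_variance(pi2, sigma, t):
    """Variance of y_star[t] - y[t]: sigma^2 * sum of pi2^(2k) for k < t."""
    return sigma ** 2 * sum(pi2 ** (2 * k) for k in range(t))

rng = random.Random(0)
x = [0.0] + [rng.uniform(0.0, 10.0) for _ in range(50)]
y, y_star, v = simulate_paths(pi1=1.0, pi2=0.8, x=x, y0=5.0, sigma=1.0, rng=rng)
gap = [a - b for a, b in zip(y_star, y)]

# The gap obeys gap[t] = pi2 * gap[t-1] - v[t], so it is autocorrelated ...
recursion_ok = all(abs(gap[t] - (0.8 * gap[t - 1] - v[t])) < 1e-9
                   for t in range(1, len(x)))
# ... and its variance changes with t, so it is heteroskedastic.
variances = [gap_variance(0.8, 1.0, t) for t in (1, 2, 5, 20)]
```

Because the gap accumulates past disturbances, its variance rises with t toward a limit when the model is stable, which is why a simple correlation between the two series is not an appropriate test statistic.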

In order to derive a relationship between an observable 
function of $y_t$ and one of $y_t^*$ which contains a more manageable 
disturbance vector, consider the linear transformation 
$L[y_t] = y_t - y_{t-1} \hat{\Pi}_2$. We first note from (3) that 
$L[y_t^*] = x_t \hat{\Pi}_1$. It follows that the vectors $\hat{v}_t$ defined by 

(8)  $\hat{v}_t = L[y_t] - L[y_t^*]$

are the reduced-form residuals of (2), i.e., $\hat{v}_t = y_t - \hat{y}_t$, where 
$\hat{y}_t = x_t \hat{\Pi}_1 + y_{t-1} \hat{\Pi}_2$. This result demonstrates that once the 
classical regression tests concerning the parameters and the 
residuals of an econometric model have been carried out, the 
results of further tests of the model via comparisons of 
linear functions of historical and simulated values of the 
endogenous variables over the period of estimation contain no 
additional information concerning the validity of the model. 
This means that even if each equation is estimated by a 
single-equation technique, the results of simulation experi- 
ments yield no information concerning the validity of the 
model as an interrelated system. Moreover, if observations 
outside the period of estimation are available, tests of the 
model using such information should be conducted in terms of 
the known multivariate distribution theory concerning fore- 
casting and not in terms of ad hoc comparisons between his- 
torical and simulated values of the endogenous variables. 
For a comprehensive discussion of known econometric results 
concerning the appraisal of econometric models see Christ [5, 
Chapters HOF 
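
The identity behind (8) can be checked numerically in a scalar setting. The sketch below assumes hypothetical estimates pi1_hat and pi2_hat and generates artificial "historical" data from assumed true coefficients; none of these values come from the paper.

```python
import random

def transform(series, lagged, pi2_hat):
    """L[y_t] = y_t - y_{t-1} * pi2_hat for a scalar series."""
    return [yt - ylag * pi2_hat for yt, ylag in zip(series, lagged)]

rng = random.Random(1)
pi1_hat, pi2_hat = 0.9, 0.6            # hypothetical estimates
x = [rng.uniform(0.0, 5.0) for _ in range(30)]
y = [2.0]                              # y[0] is the initial condition
for t in range(1, len(x)):             # artificial data from assumed true values
    y.append(1.0 * x[t] + 0.5 * y[-1] + rng.gauss(0.0, 1.0))

y_star = [y[0]]                        # non-stochastic simulation as in (3)
for t in range(1, len(x)):
    y_star.append(x[t] * pi1_hat + y_star[-1] * pi2_hat)

# v_hat as in (8): L[y_t] - L[y_t^*]
v_hat = [a - b for a, b in zip(transform(y[1:], y[:-1], pi2_hat),
                               transform(y_star[1:], y_star[:-1], pi2_hat))]
# One-step reduced-form residuals: y_t - (x_t * pi1_hat + y_{t-1} * pi2_hat)
residuals = [y[t] - (x[t] * pi1_hat + y[t - 1] * pi2_hat)
             for t in range(1, len(x))]
max_diff = max(abs(a - b) for a, b in zip(v_hat, residuals))
```

The two series agree up to rounding error, so a comparison of transformed historical and simulated values carries the same information as the ordinary reduced-form residuals.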


Stochastic Simulation 


We now consider the case of stochastic simulation. That 
is, in each period a random variable is generated and added 
to a quantity such as $y_t^*$. More formally, we define the 
vector of stochastically simulated values of the endogenous 
variables at time t as $y_t^s$ where 

(9)  $y_t^s = x_t \hat{\Pi}_1 + y_{t-1}^s \hat{\Pi}_2 + \varepsilon_t, \qquad t = 1, 2, \ldots; \qquad y_0^s = y_0,$

where $\varepsilon_t$ is a 1 x K vector of disturbances at time t generated 
by the experimenter. The distribution of $\varepsilon_t$ is identical to 
that estimated for the reduced-form vector $v_t$. We assume that 
$\varepsilon_t$ is independent of $u_s$ and $x_s$ for all t and s. 

Consider now the large-sample counterpart of (9), i.e., set 
$\hat{\Pi}_1 = \Pi_1$ and $\hat{\Pi}_2 = \Pi_2$. Then it is clear from (2) and (9) that 
the process generating $y_t^s$ is identical to that generating the 
historical values $y_t$. Therefore, an examination of the time 
paths of the elements of $y_t^s$ should yield information concern- 
ing the properties of the process generating the elements of 
$y_t$. However, it does not follow that the elements of $y_t^s$ will, 
on the average, be good predictors of the corresponding 
elements of $y_t$. In fact, the structure of the relationships 
between the elements of $y_t^s$ and those of $y_t$ is similar to that 
of a model of errors of measurement. It therefore follows 
that if the elements of $y_t$ are regressed on the corresponding 
elements of $y_t^s$ the slope of the regression line will be less 
than unity. 

The analogy with the errors of measurement model is readily 
apparent from a comparison of the solution of $y_t$ from (2) with 
the solution of $y_t^s$ from the large-sample counterpart of (9). 
These two solutions can be written as 

(10)  $y_t = C(x_1, \ldots, x_t, y_0) + \sum_{j=1}^{t} v_j \Pi_2^{t-j},$

(11)  $y_t^s = C(x_1, \ldots, x_t, y_0) + \sum_{j=1}^{t} \varepsilon_j \Pi_2^{t-j},$

where 

(12)  $C(x_1, x_2, \ldots, x_t, y_0) = y_0 \Pi_2^{t} + \sum_{j=1}^{t} x_j \Pi_1 \Pi_2^{t-j}.$

Upon substitution of (11) into (10), a relationship between 
the historical and simulated values of the endogenous vari- 
ables is obtained: 

(13)  $y_t = y_t^s + \phi_t, \qquad \phi_t = \sum_{j=1}^{t} (v_j - \varepsilon_j) \Pi_2^{t-j}.$

In order to investigate the relationships between the 
elements of $y_t^s$ and those of $\phi_t$, we now derive the covariance 
matrix $\Omega_t = E[(y_t^s)' \phi_t]$. To do this we first note from (9) 
that if the small-sample errors in $\hat{\Pi}_1$ and $\hat{\Pi}_2$ are ignored, 

(14)  $E\Big[(y_t^s)' \sum_{j=1}^{t} \varepsilon_j \Pi_2^{t-j}\Big] = \sum_{j=1}^{t} (\Pi_2')^{t-j} V_\varepsilon \Pi_2^{t-j},$

where $V_\varepsilon = E[\varepsilon_t' \varepsilon_t]$. The assumptions underlying (9) imply 

(15)  $E[(y_t^s)' v_{t-j}] = 0, \qquad j = 0, 1, \ldots, t-1.$

Therefore, from (14), (15), and the definition of $\Omega_t$ 
we have 

(16)  $\Omega_t = E[(y_t^s)' \phi_t] = -\sum_{j=1}^{t} (\Pi_2')^{t-j} V_\varepsilon \Pi_2^{t-j}.$



Provided the econometric model is stable, i.e., the character- 
istic roots of $\Pi_2$ lie inside the unit circle in the complex 
plane, this covariance matrix converges as $t \to \infty$. 

This property of $\Omega_t$ can be verified by noting that (16) 
implies that $\Omega_t$ is generated by 

(16A)  $\Omega_t = -V_\varepsilon + \Pi_2' \Omega_{t-1} \Pi_2.$

Moreover, since $\Omega_t$ is symmetric, it has a modified square root 
so that the homogeneous part of the difference equation (16A) 
can be written as 

(16B)  $K_t' K_t = \Pi_2' K_{t-1}' K_{t-1} \Pi_2.$

It is clear that $\Omega_t$ will converge to the matrix $\Omega$ obtained by 
solving $\Omega - \Pi_2' \Omega \Pi_2 = -V_\varepsilon$ provided the solution of the 
homogeneous system (16B) converges to the null matrix. If the 
characteristic roots of $\Pi_2$ are less than unity in absolute value, 
the matrix $K_t$ generated by 

(16C)  $K_t = K_{t-1} \Pi_2$

will converge to the null matrix for any set of initial condi- 
tions. Thus the stability of the econometric model guarantees 
convergence of the covariance matrix $\Omega_t$. 
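
The convergence argument can be illustrated by iterating a recursion of the form "new matrix = minus a covariance matrix plus the transposed coefficient matrix times the old matrix times the coefficient matrix" for a small example. The sketch below is an assumption-laden illustration: the 2 x 2 matrices pi2 and v_eps are hypothetical values chosen so that the characteristic roots (0.6 and 0.3) lie inside the unit circle, and the helper functions are not from the paper.

```python
def matmul(a, b):
    """Product of two small matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    """Transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*a)]

def step(omega, pi2, v_eps):
    """One step of the recursion: -V_eps + Pi2' * Omega * Pi2."""
    inner = matmul(matmul(transpose(pi2), omega), pi2)
    return [[inner[i][j] - v_eps[i][j] for j in range(2)] for i in range(2)]

pi2 = [[0.5, 0.2], [0.1, 0.4]]      # hypothetical; characteristic roots 0.6 and 0.3
v_eps = [[1.0, 0.3], [0.3, 2.0]]    # hypothetical positive definite covariance
omega = [[0.0, 0.0], [0.0, 0.0]]    # starting value: the empty sum at t = 0
for _ in range(200):
    omega = step(omega, pi2, v_eps)

# At the limit one more step should leave the matrix unchanged.
nxt = step(omega, pi2, v_eps)
residual = max(abs(nxt[i][j] - omega[i][j]) for i in range(2) for j in range(2))
```

With roots inside the unit circle the iterates settle on a fixed matrix whose diagonal elements are negative; replacing pi2 by a matrix with a root outside the unit circle would make the iterates grow without bound.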

Since $V_\varepsilon$ is a variance-covariance matrix it must be positive 
definite. Therefore the diagonal elements of a matrix such as 
$-(\Pi_2')^{t-j} V_\varepsilon \Pi_2^{t-j}$ must be negative. It follows from (16) that the 
diagonal elements of $\Omega_t$ are negative, which in turn implies 
that each element of $y_t^s$ is negatively correlated with the 
corresponding element of the error vector $\phi_t = y_t - y_t^s$ in (13). 
Hence, from (13) and the standard 
results concerning errors of measurement as described, for 
example, by Johnston [19, pp. 148-50], we see that if the 
elements of $y_t$ are regressed on the corresponding elements of 
$y_t^s$, the slope of the estimated lines will be less than unity. 
Moreover, the scatter diagrams between the elements of $y_t^s$ and 
those of $y_t$ will not outline 45 degree lines. Therefore, even 
though the process generating $y_t^s$ is identical to that generat- 
ing $y_t$, the elements of $y_t^s$ will, on the average, fail to pre- 
dict consistently the corresponding elements of $y_t$. 
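
The errors-of-measurement effect can be seen in a small Monte Carlo experiment. The sketch below is illustrative only: it assumes a scalar reduced form with hypothetical coefficients, a standard normal exogenous series, and unit-variance disturbances, and it regresses the historical series on the stochastically simulated one over time. Under these assumed values the population slope is one half, not unity.

```python
import random

def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    return sxy / sxx

rng = random.Random(42)
pi1, pi2, T = 1.0, 0.5, 20000           # hypothetical scalar reduced form
y, y_s = [0.0], [0.0]
for _ in range(T):
    x_t = rng.gauss(0.0, 1.0)           # exogenous input shared by both paths
    v_t = rng.gauss(0.0, 1.0)           # historical disturbance
    e_t = rng.gauss(0.0, 1.0)           # experimenter's disturbance, same distribution
    y.append(pi1 * x_t + pi2 * y[-1] + v_t)
    y_s.append(pi1 * x_t + pi2 * y_s[-1] + e_t)

slope = ols_slope(y_s, y)               # regress historical on simulated values
```

Both series are generated by the same process, yet the estimated slope falls well short of unity because the simulated series contains its own accumulated disturbances, which are unrelated to those in the historical series.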


Simulation of Nonlinear Models 


In this section we specify a structural model which is non- 
linear in the endogenous variables but linear in the param- 
eters [9]. The reduced-form equations are derived and compared 
to the process generating the simulated values of the endog- 
enous variables. On the basis of such a comparison, it is 
found that the simulated values can be expected to diverge 
systematically from the corresponding historical values. 

Consider the model 


(17)  $y_t = x_t H_1 + F(y_t, y_{t-1}, x_t) H_2 + R(y_{t-1}, x_t) H_3 + u_t,$

where $y_t$, $x_t$, $y_{t-1}$, and $u_t$ are exactly as defined in (1); $H_1$, 
$H_2$, and $H_3$ are, respectively, G x K, $M_1$ x K, and $M_2$ x K 
matrices of parameters; $F(y_t, y_{t-1}, x_t)$ is a 1 x $M_1$ vector of 
observations at time t on $M_1$ functions $f_{it} = f_i(y_t, y_{t-1}, x_t)$. 
Each of these functions is assumed to depend upon at least one 
of the endogenous variables (elements of $y_t$) and an arbitrary 
number of predetermined variables. In addition, at least one 
of the functions $f_{it}$ is assumed to be nonlinear in one or 
more of the endogenous variables. Similarly, $R(y_{t-1}, x_t)$ is 
a 1 x $M_2$ vector of observations on $M_2$ functions $r_{it} = r_i(y_{t-1}, x_t)$. 
We retain the assumptions underlying (1) concerning the 
stochastic process generating the elements of $x_t$ and $u_t$. 
Finally, in order to interpret the K equations of (17) as struc- 
tural equations for the K variables in $y_t$, we assume that, if 
$f_{it}$ contains the jth element of $y_t$, then the i,jth element of 
$H_2$ is zero. Without this assumption, the equations (17) would 
not be linear in $H_1$, $H_2$, and $H_3$. 


The Reduced-Form Equations 


Consider the functions $f_{it}$, $i = 1, 2, \ldots, M_1$, appearing in 
(17). Each of these functions can be considered a random 
variable. Then because the mathematical expectation of one 
variable conditional upon a set of others is, in general, a 
function of those conditioning variables, we have 

(18)  $E[f_{it} \mid x_t, y_{t-1}] = s_{it}, \qquad i = 1, \ldots, M_1,$

where $s_{it}$ is a function of the elements of $x_t$ and $y_{t-1}$, i.e., 
$s_{it} = s_i(x_t, y_{t-1})$. (A sufficient condition for the functions 
$f_{it}$ to have finite moments is that these functions have fin- 
ite range, a condition which is satisfied in most econometric 
models.) It is clear from (18) that $f_{it}$ can be expressed as 

(19)  $f_{it} = s_{it} + w_{it}, \qquad i = 1, \ldots, M_1,$

where $w_{it}$ is a stochastic element such that $E[w_{it} \mid x_t, y_{t-1}] = 0$ 
- see Wold [25]. Using (19) we see that 

(20)  $F(y_t, y_{t-1}, x_t) = S(y_{t-1}, x_t) + W_t,$

where $S(y_{t-1}, x_t)$ and $W_t$ are 1 x $M_1$ vectors whose ith elements 
are, respectively, $s_{it}$ and $w_{it}$. Substituting (20) into (17) 
we have 

(21)  $y_t = x_t H_1 + S(y_{t-1}, x_t) H_2 + R(y_{t-1}, x_t) H_3 + e_t = J(y_{t-1}, x_t) + e_t,$

where $e_t = u_t + W_t H_2$. It is clear that $E[e_t \mid x_t, y_{t-1}] = 0$, 
and so $E[y_t \mid x_t, y_{t-1}] = J(y_{t-1}, x_t)$. We therefore define the 
system of equations in (21) as the reduced-form equations for 
the elements of $y_t$ in terms of the elements of $x_t$ and $y_{t-1}$. 

It should be evident that the equations in (21) do not 
represent the solution of the system in (17) for the endog- 
enous variables in terms of the elements of $x_t$, $y_{t-1}$, and 
linear combinations of the structural disturbances in $u_t$. For 
instance, assume that the solution of the system for the ith 
element of $y_t$, say $y_{it}$, in terms of the elements of $x_t$, $y_{t-1}$, 
and the structural disturbances is given by 

(22)  $y_{it} = \gamma_i(y_{t-1}, x_t, u_t), \qquad i = 1, \ldots, K.$

Because the endogenous variables do not appear in the same 
functional form in all equations, the functions $\gamma_i$ will, in 
general, be nonlinear. Then the ith element of $J(y_{t-1}, x_t)$, 
say $J_{it} = J_i(y_{t-1}, x_t)$, is obtained by assuming 

(23)  $E[\gamma_i(y_{t-1}, x_t, u_t) \mid y_{t-1}, x_t] = J_i(y_{t-1}, x_t).$

Clearly, an additive linear function of the disturbances is 
not the only function which has a mean conditional upon given 
values of the elements of $x_t$ and $y_{t-1}$. 

An example may help to clarify and extend the above argu- 
ment. Consider the explicit but simplified version of the 
original system: 

(24)  $y_{1t} = b_1 x_t + u_{1t},$

(25)  $y_{2t} = b_2 y_{1,t-1} + b_3 \exp(y_{1t}) + u_{2t},$

where the disturbances $u_{it}$ (i = 1, 2) are normally distributed 
with means zero, variances $\sigma_1^2$ and $\sigma_2^2$, and covariance 
$\sigma_{12} \neq 0$. Assume that each $u_{it}$ is not autocorrelated and, 
further, is independent of $x_t$. 

The solution of (24) and (25) for $y_{2t}$ in terms of the pre- 
determined variables and the disturbances is 

(26)  $y_{2t} = b_2 y_{1,t-1} + b_3 \exp(b_1 x_t) \exp(u_{1t}) + u_{2t}.$

Then, because $E[\exp(u_{1t}) \mid x_t, y_{1,t-1}] = \exp(\sigma_1^2/2)$, we note that 

(27)  $\exp(u_{1t}) = \exp(\sigma_1^2/2) + u_{3t},$

where $E[u_{3t} \mid x_t, y_{1,t-1}] = 0$. Substituting (27) into (26) we 
obtain the reduced-form equation for $y_{2t}$: 

(28)  $y_{2t} = b_2 y_{1,t-1} + b_4 \exp(b_1 x_t) + z_t,$

where $b_4 = b_3 \exp(\sigma_1^2/2)$, and the reduced-form disturbance is 
$z_t = u_{2t} + b_3 \exp(b_1 x_t) u_{3t}$. It is clear that $E[z_t \mid x_t, y_{1,t-1}] = 0$. 

In comparing (28) with (26), we see that the deterministic 
part of the reduced-form equation (obtained by setting $z_t = 0$) 
cannot be derived from (26) by setting the structural distur- 
bances $u_{1t}$ and $u_{2t}$ equal to zero. In brief, the reduced-form 
equation for $y_{2t}$ is not a solution of the system. A related 
point is that the reduced-form disturbance, $z_t$, is not a lin- 
ear function of the structural disturbances. Indeed, it is 
clear that $z_t$ is heteroskedastic with respect to $x_t$ even 
though the structural disturbances are homoskedastic. Fin- 
ally, it should be noted that, in general, if the structural 
disturbances of an econometric model are assumed to be uncor- 
related rather than independently distributed over time, the 
reduced-form disturbances will be autocorrelated since non- 
linear functions of uncorrelated variables are generally cor- 
related. Thus, the properties of the reduced-form distur- 
bances should not be inferred from those of the structural 
disturbances. 
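
The gap between the solution with the disturbances set to zero and the reduced form can be made concrete with numbers. In the sketch below the parameter values b1, b2, b3 and the standard deviation sigma1 are hypothetical, chosen only for illustration, and the Monte Carlo average is a rough check on the lognormal mean used in the example.

```python
import math
import random

b1, b2, b3, sigma1 = 0.5, 0.3, 2.0, 0.8     # hypothetical parameter values

def solution_without_disturbances(y1_lag, x):
    """Solution for y2 with both structural disturbances set to zero."""
    return b2 * y1_lag + b3 * math.exp(b1 * x)

def reduced_form(y1_lag, x):
    """Conditional mean of y2: uses b4 = b3 * exp(sigma1^2 / 2)."""
    b4 = b3 * math.exp(sigma1 ** 2 / 2.0)
    return b2 * y1_lag + b4 * math.exp(b1 * x)

rng = random.Random(7)
draws = [math.exp(rng.gauss(0.0, sigma1)) for _ in range(100000)]
mean_exp_u = sum(draws) / len(draws)        # sample analogue of E[exp(u1)]

gap_small_x = reduced_form(1.0, 0.0) - solution_without_disturbances(1.0, 0.0)
gap_large_x = reduced_form(1.0, 4.0) - solution_without_disturbances(1.0, 4.0)
```

The sample mean of exp(u1) lies close to exp(sigma1^2 / 2) rather than to one, and the gap between the two functions grows with x, so a trend in the exogenous variable widens the discrepancy.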


Non-Stochastic Simulation 


In order to simplify the ensuing argument, but leaving the 
results intact, we ignore the problem of estimation and assume, 
instead, that the parameter matrices $H_1$, $H_2$, and $H_3$ are known. 
Since consistent estimation procedures have been developed for 
models such as (17) - see Eisenpress and Greenstadt [7] and 
Kelejian [21] - this amounts to a large-sample analysis of 
the simulation results. 


The 1 x K vector of simulated values $y_t^*$ is defined as 

(29)  $y_t^* = x_t H_1 + F(y_t^*, y_{t-1}^*, x_t) H_2 + R(y_{t-1}^*, x_t) H_3, \qquad t = 1, 2, \ldots; \qquad y_0^* = y_0.$

The solution of (29) for $y_t^*$, t = 1, 2, ..., is usually obtained by 
numerical and sequential methods - see Evans and Klein [8, 
pp. 39-49]. For instance, given $y_0^*$ and $x_1$, the system in (29) 
is first solved by iterative procedures for $y_1^*$. Then, given 
$x_2$ and $y_1^*$, $y_2^*$ is obtained, et cetera. Thus, let $y_t^*$ be 
expressed in general as 

(30)  $y_t^* = T_1(y_{t-1}^*, x_t),$

where $T_1(y_{t-1}^*, x_t)$ is a 1 x K vector whose ith element is the 
solution of the system (29) corresponding to the ith element 
of $y_t^*$. Assume now that the solution of the original structural 
system (17) is 

(31)  $y_t = T_2(y_{t-1}, x_t, u_t).$

Then it is clear that 

(32)  $y_t^* = T_2(y_{t-1}^*, x_t, 0) = T_1(y_{t-1}^*, x_t).$
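
The sequential, iterative solution described above can be sketched for a small system. The two equations below are hypothetical and chosen only because they are simultaneous, nonlinear in the current endogenous variables, and easy to solve by successive substitution; they are not taken from the paper or from Evans and Klein.

```python
import math

def solve_period(y_lag, x_t, tol=1e-12, max_iter=200):
    """Solve a hypothetical two-equation system for (y1_t, y2_t) by iteration.

        y1_t = 0.5 * x_t + 0.2 * tanh(y2_t)
        y2_t = 0.3 * y1_{t-1} + 0.4 * exp(-y1_t)
    """
    y1, y2 = y_lag                      # start from last period's values
    for _ in range(max_iter):
        y1_new = 0.5 * x_t + 0.2 * math.tanh(y2)
        y2_new = 0.3 * y_lag[0] + 0.4 * math.exp(-y1_new)
        if abs(y1_new - y1) < tol and abs(y2_new - y2) < tol:
            return y1_new, y2_new
        y1, y2 = y1_new, y2_new
    raise RuntimeError("iteration did not converge")

def simulate(y0, xs):
    """Non-stochastic simulation: solve period by period, feeding y*_{t-1} forward."""
    path = [y0]
    for x_t in xs:
        path.append(solve_period(path[-1], x_t))
    return path

xs = [1.0, 1.2, 1.5, 1.1]
path = simulate((1.0, 0.5), xs)
```

Each call to solve_period plays the role of the period-by-period solution function, and simulate chains these solutions so that every period starts from the previously simulated values rather than the historical ones.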

In comparing the process (32) generating the vectors of 
simulated values with the process (21) generating the vectors 
of historical values of the endogenous variables, two points 
should be noted. First, because the K functions of $T_1(y_{t-1}, x_t)$ 
are not equal to those of $J(y_{t-1}, x_t)$, multiplier analysis 
based on non-stochastic simulation yields results which do 
not apply to the corresponding historical values. (For an 
example of multiplier analysis in a nonlinear system based on 
simulation results see Evans and Klein [8, pp. 48-49, and 
Chapter 5]. See also Fromm and Taubman [10), Chapters ie) 


That is, in general 

(33)  $\frac{\partial y_t^*}{\partial (y_{t-1}^*, x_t)} \neq \frac{\partial E[y_t \mid y_{t-1}, x_t]}{\partial (y_{t-1}, x_t)},$

where, in general, if $p = (p_1, \ldots, p_m)$ and $q = (q_1, \ldots, q_n)$, 
then $\partial p / \partial q$ is an m x n matrix whose i,jth element is $\partial p_i / \partial q_j$. 
The second point to note is that the elements of $y_t^*$ can be 
expected to diverge systematically from the corresponding 
elements of $y_t$. More explicitly, from (30)-(32) it can be 
shown that unless the disturbance vector $u_t$ is degenerate in 
the sense that all the moments of its elements are zero, 

(34)  $E[(y_t - y_t^*) \mid y_0, X_t] = \Theta(y_0, X_t) \neq 0,$

where $X_t = (x_1, \ldots, x_t)$ and $\Theta(y_0, X_t)$ is a 1 x K vector of func- 
tions of the elements of $y_0$ and $X_t$. Therefore one would not 
expect the K scatter diagrams between the elements of $y_t$ and 
of $y_t^*$ to outline 45 degree lines. Indeed simulation over a 
period in which the elements of $x_t$ show a trend could lead to 
an increasing divergence between the elements of $y_t$ and those 
of $y_t^*$ even though the econometric model (17) is properly spec- 
ified. It is clear, therefore, that nonlinear models such as 
(17) should not be validated in terms of comparisons between 
the elements of $y_t^*$ and those of $y_t$. As in the linear case, 
validation should be carried out in terms of the multivariate 
distribution theory corresponding to the estimates of the 
structural parameters and the various tests for randomness 
concerning the structural disturbances. 

The results given above suggest that the properties of dy- 
namic nonlinear models should not be studied in terms of non- 
stochastic simulation procedures. Essentially, the reason 
for this is that such simulation results are based on the 
solutions of the structural equations but these solutions are 
not the reduced-form equations. Now, because of the difficul- 
ties involved in obtaining analytical solutions to a system 
of nonlinear equations and then performing the integrations 
necessary to obtain the conditional expectations, the reduced- 
form equations in (21) will generally be unknown. Hence, some 
approximation is necessary. 

One possibility is to obtain points on the reduced-form 
equations via stochastic simulation. For instance, let $v_t$ 
be a 1 x K vector of disturbances at time t which are gener- 
ated by the experimenter and which have a distribution identi- 
cal to the structural disturbances $u_t$. Then the stochastically 
simulated values of the endogenous variables are defined in 
terms of equation (29) with the exception that $v_t$, t = 1, 2, ..., 
is added to the right hand side of that equation. In this 
case, it is clear that the solution of the resulting equation 
for $y_t^*$, now denoted as $y_t^{**}$, would be (see (31)) 

(35)  $y_t^{**} = T_2(y_{t-1}^{**}, x_t, v_t), \qquad t = 1, 2, \ldots.$

Therefore, if the time path of $x_t$ is held constant, and $y_t^{**}$ 
is repeatedly generated, the ensemble averages of the elements 
of $y_t^{**}$ would be determined by the reduced-form equations: 

(36)  $\operatorname{plim}_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} T_2(y_{t-1}^{**}(i), x_t, v_t(i)) = J(y_{t-1}^{**}, x_t),$

where $y_t^{**}(i)$ and $v_t(i)$ are the vectors of values of $y_t^{**}$ and 
$v_t$ corresponding to the ith simulation. Therefore, the pro- 
perties of the reduced-form equations may be studied in terms 
of the simulation results corresponding to different time 
paths of $x_t$ - see Nagar [24] for an example of stochastic sim- 
ulation. However, it should be clear that one would not val- 
idate a nonlinear model in terms of comparisons between the 
elements of $y_t$ and those of $y_t^{**}$. The reason for this is that 
the difference between these vectors is a vector of variables 
which are autocorrelated, heteroskedastic, and, in general, 
will have a distribution which is not known. 
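
Ensemble averaging over repeated stochastic simulations can be sketched with the two-equation example (24)-(25). The parameter values, the short exogenous path, and the number of replications below are hypothetical, and for simplicity the two disturbances are drawn independently, an assumption that departs from the nonzero covariance in the example but does not affect the conditional mean being estimated.

```python
import math
import random

b1, b2, b3 = 0.5, 0.3, 2.0            # hypothetical parameters of (24)-(25)
sigma1, sigma2 = 0.8, 1.0             # hypothetical disturbance standard deviations
x_path = [0.0, 1.0, 2.0, 3.0]         # fixed time path of the exogenous variable

def one_replication(rng):
    """One stochastic simulation of y2 over x_path, starting from y1 = 0."""
    y1_lag, y2_path = 0.0, []
    for x in x_path:
        y1 = b1 * x + rng.gauss(0.0, sigma1)
        y2 = b2 * y1_lag + b3 * math.exp(y1) + rng.gauss(0.0, sigma2)
        y2_path.append(y2)
        y1_lag = y1
    return y2_path

rng = random.Random(3)
N = 50000
totals = [0.0] * len(x_path)
for _ in range(N):
    for t, value in enumerate(one_replication(rng)):
        totals[t] += value
ensemble_mean = [s / N for s in totals]

def mean_lag(t):
    """Expected value of y1 in the previous period given the x path."""
    return b1 * x_path[t - 1] if t > 0 else 0.0

# Mean of y2_t given the x path, using b4 = b3 * exp(sigma1^2 / 2) as in (28).
b4 = b3 * math.exp(sigma1 ** 2 / 2.0)
expected = [b2 * mean_lag(t) + b4 * math.exp(b1 * x_path[t])
            for t in range(len(x_path))]
# Non-stochastic simulation: the same recursion with all disturbances set to zero.
non_stochastic = [b2 * mean_lag(t) + b3 * math.exp(b1 * x_path[t])
                  for t in range(len(x_path))]
```

The ensemble means track the values implied by the reduced form, while the non-stochastic path lies below them in every period, by a margin that widens as x increases.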

If stochastic simulation is not feasible, an alternative 
approach is to assume that each reduced-form equation can be 
approximated by a polynomial. More explicitly, the ith ele- 
ment of $J(y_{t-1}, x_t)$, i.e., $J_{it}$, may be expressed as 

(37)  $J_{it} = P_{it}^{d_i} + \delta_{it}^{d_i},$

where $P_{it}^{d_i}$ is a polynomial of degree $d_i$ in the elements of $x_t$ 
and $y_{t-1}$, and $\delta_{it}^{d_i}$ is the remainder in the approximation. 
Assume now that $\delta_{it}^{d_i} \to 0$ as $d_i \to \infty$. Then for sufficiently 
large $d_i$, we have (see (21)) 

(38)  $y_{it} = P_{it}^{d_i} + e_{it},$

where $e_{it}$ is the ith element of $e_t$. Since $E[e_{it} \mid y_{t-1}, x_t] = 0$, 
the ordinary least squares estimate of $P_{it}^{d_i}$, say $\hat{P}_{it}^{d_i}$, would, as 
$d_i \to \infty$, be a consistent estimate of $E[y_{it} \mid y_{t-1}, x_t]$. There- 
fore the multipliers relating $y_{it}$ to the predetermined vari- 
ables could be approximated analytically by 

(39)  $\frac{\partial \hat{P}_{it}^{d_i}}{\partial (y_{t-1}, x_t)}.$

Dynamic Properties of Stochastic Linear Systems 


It was previously suggested that the inherent dynamic pro- 
perties of a linear system could be inferred from the simu- 
lated time paths of the endogenous variables of the model even 
though there need not be a close correspondence between the 
simulated and historical values of the endogenous variables. 
The very fact that the simulated and historical values of the 
variables need not correspond indicates that it is the dynamic 
properties of the simulation paths, and not these paths them- 
selves, that are of primary interest. However, the results 
of simulation experiments are difficult to interpret because 
of sampling variability. It is therefore desirable to consi- 
der analytical methods which are not subject to sampling var- 
iability that can be used to infer the dynamic properties of 
a stochastic system. In this section it is shown that the 
dynamic properties of a model can be studied analytically in 
terms of the spectral representation of the solution of a sys- 
tem of stochastic linear difference equations. The method is 
applied to a simple econometric model and the use of the 
implied dynamic characteristics for an investigation of the 
validity of the model is considered. 


The Final Form and Solution of a Linear Econometric Model 


Consider again the reduced-form equations (9) that are used 
in stochastic simulation. For expository purposes, this sys- 
tem is rewritten as 

(40)  $A(L) z_t = B x_t' + \varepsilon_t',$

where $z_t = (y_t^s)'$, $B = \Pi_1'$, and $A(L) = I - \Pi_2' L$ with L defined as the 
lag operator. The final form of the system, the framework 
within which the dynamic properties of an econometric model 
are usually analyzed, can be derived as follows. Let the K 
x K $\lambda$-matrix $a(\lambda)$ denote the adjoint of $A(\lambda)$ and let $\Delta(\lambda) = 
|A(\lambda)|$ denote the determinantal polynomial of $A(\lambda)$. Premulti- 
plying (40) by $a(L)$ yields the final form of the econometric 
model: 

(41)  $\|\Delta(L)\| z_t = a(L) B x_t' + a(L) \varepsilon_t',$

where $\|\Delta(L)\|$ is a matrix with $\Delta(L)$ on the main diagonal and 
zeros everywhere else. The final form [11] is thus a system 
of stochastic difference equations, each equation of which 
has the same autoregressive part, $\Delta(L)$. 

The method of solving a system of linear difference equa- 
tions with constant coefficients such as (41) is well known 
[2,23]. The complete solution is composed of a particular 
solution of the original system and a general solution of the 
homogeneous system obtained by deleting $B x_t' + \varepsilon_t'$ from (40). A 
particular solution can be written formally at sight, namely, 

(42)  $z_t = \frac{a(L)}{\Delta(L)} B x_t' + \frac{a(L)}{\Delta(L)} \varepsilon_t'.$

The complete solution is thus given by 

(43)  $z_t = C \lambda^t + \frac{a(L)}{\Delta(L)} B x_t' + \frac{a(L)}{\Delta(L)} \varepsilon_t',$

where C is a K x n matrix of constants determined by the ini- 
tial conditions and $\lambda$ is an n-dimensional vector $(\lambda_1, \ldots, \lambda_n)$ 
of roots of the nth degree polynomial $\lambda^n \Delta(\lambda^{-1}) = 0$, with $\lambda^t$ 
denoting the vector of their t-th powers. (If there 
are repeated roots of the determinantal equation some of the 
columns of C may contain powers of t.) 

The complete solution thus consists of three parts: a so- 
called transient response $C \lambda^t$ which, provided the system is 
stable, approaches zero as t increases; a component of the 
particular solution, $\frac{a(L)}{\Delta(L)} B x_t'$, corresponding to the exogenous 
variables; and a component of the particular solution, $\frac{a(L)}{\Delta(L)} \varepsilon_t'$, 
corresponding to the disturbance terms. The usual method of 
determining the dynamic properties of the solution is to sup- 
press the stochastic part of the solution and to analyze only 
the deterministic solution [11,22]. This is equivalent to 
looking at the expected value of the time path of the endogen- 
ous variables of the system given the exogenous variables. 

Two kinds of information are obtained from the deterministic 
system. The values of the roots of the determinantal equation 
yield information about the modulus and periodicity of the 
transient response. If the roots are all less than unity in 
absolute value, the system is stable and approaches the parti- 
cular solution from any set of initial conditions. If complex 
roots occur this is usually taken as an indication that the 
system will tend to oscillate. The periodicity and rate of 
damping of the sinusoidal components contributed by complex 
roots can be ascertained from these roots. Dynamic multipliers 
may be calculated to determine the response of the endogenous 
variables to changes in the exogenous variables. 

There is little doubt that these methods provide interest- 
ing and useful information about the system of equations. For 
short-term forecasting and the formulation of discretionary 
stabilization policy, these techniques may provide a sufficient
characterization of the dynamic properties of the model.
If, however, the longer-term properties of the model are to 
be investigated, it may not be reasonable to disregard the 
impact of the disturbance terms on the time paths of the endog- 
enous variables. (Haavelmo [13] has pointed out the inadequacy 
of a comparison of the solution of the deterministic system 
with the observed series of observations for testing dynamic 
theories.) Neither of the above techniques provides information
about the magnitude or correlation properties of deviations
from the expected value of the time path. In what follows,
attention will be focussed on the contribution of the disturbance
terms to the time paths of the variables in the model.


The Spectral Representation of the Solution 


An analytical description of the properties of the stochast- 
ic response of a system of difference equations can be based on 
the spectral representation of a stochastic process. (A good 
introduction to spectral representations is contained in Yaglom 
[26].) Suppose that the disturbances in the model, $\varepsilon_{it}$ $(i = 1, 2, \ldots, K)$,
are generated by a wide-sense stationary process;
that is, the means, variances, and lagged covariances of the 
disturbances are independent of time. Then the disturbance 


process has the spectral representation 




(44)    $\varepsilon_t' = \int_{-\pi}^{\pi} e^{i\omega t}\,dU(\omega)$


where $dU(\omega)$ is a K x 1 vector of stochastic functions and $i = \sqrt{-1}$.
It is easy to verify that the spectrum matrix of the
disturbance process is given by [26]

(45)    $f(\omega) = E[dU(\omega)\,dU^*(\omega)]$

where $dU^*$ is the conjugate transpose of $dU$.
Returning to the complete solution of the linear econometric 
model given by (43), the particular solution corresponding to 


the disturbance terms can now be written as 


(46)    $z_t' = \dfrac{a(L)}{\Delta(L)} \int_{-\pi}^{\pi} e^{i\omega t}\,dU(\omega)$


where $\varepsilon_t'$ has now been replaced by its spectral representation.
Interchanging the order of the operations in this expression 


leads directly to 


(47)    $z_t' = \int_{-\pi}^{\pi} e^{i\omega t}\,T(\omega)\,dU(\omega)$

where $T(\omega) = a(e^{-i\omega})/\Delta(e^{-i\omega})$ is the K x K transfer matrix
obtained by operating on $e^{i\omega t}$ by $a(L)/\Delta(L)$. The interchange
of operations involved in going from (46) to (47) is permissible
provided that each of the elements in the matrix
$a(L)/\Delta(L)$ converges absolutely. This will be true if
the roots of the determinantal equation $\lambda^n \Delta(\lambda^{-1}) = 0$ have
modulus less than one so that the system is stable. Provided
this is the case, this last expression indicates that the
kernel of the $z_t$ process is $dZ(\omega) = T(\omega)\,dU(\omega)$.

The spectral matrix $F(\omega) = [F_{ij}(\omega)]$ of the endogenous variables
is now obtained using (45) with the appropriate substitutions:

(48)    $F(\omega) = E[dZ(\omega)\,dZ^*(\omega)] = E[T(\omega)\,dU(\omega)\,dU^*(\omega)\,T^*(\omega)] = T(\omega)\,f(\omega)\,T^*(\omega)$

The spectral matrix, each element of which is a function of
angular frequency $\omega$, provides a compact description of the


second-moment properties of the stochastic model. The elements 




along the main diagonal are the power spectra of the endogen- 
ous variables and the off-diagonal elements are the cross- 
spectra relating the corresponding endogenous variables. It 
might be noted that if the covariance functions of the endog- 
enous variables are of direct interest, these may be obtained 


by transforming the spectral matrix $F(\omega)$:

(48A)    $\gamma_{ij}(s) = \int_{-\pi}^{\pi} e^{-i\omega s}\,F_{ij}(\omega)\,d\omega.$


The use of equations (45), (48), and this transformation pro- 
vides a computationally simple scheme for obtaining these co- 


variance functions. 
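
The recipe in (45), (48), and (48A) can be sketched numerically. What follows is a minimal illustration in Python, not the computation behind the results reported in this paper. It assumes a hypothetical two-variable first-order system $z_t = \Phi z_{t-1} + \varepsilon_t$, so that $A(L) = I - \Phi L$, serially uncorrelated disturbances with covariance matrix $\Sigma$, and the normalization $f(\omega) = \Sigma/2\pi$. The names `PHI`, `SIGMA`, `spectrum_matrix`, and the grid size `N` are assumptions introduced for the example.

```python
import cmath
import math

def inv2(m):
    """Inverse of a 2 x 2 complex matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul2(x, y):
    """Product of two 2 x 2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def conj_transpose2(m):
    """Conjugate transpose of a 2 x 2 matrix."""
    return [[complex(m[j][i]).conjugate() for j in range(2)] for i in range(2)]

def spectrum_matrix(phi, sigma, omega):
    """F(omega) = T f T*, with T = A(e^{-i omega})^{-1} and A(L) = I - phi L."""
    x = cmath.exp(-1j * omega)
    a = [[(1.0 if i == j else 0.0) - phi[i][j] * x for j in range(2)]
         for i in range(2)]
    t = inv2(a)
    # Assumed normalization: white-noise spectrum matrix f = sigma / (2 pi).
    f = [[sigma[i][j] / (2.0 * math.pi) for j in range(2)] for i in range(2)]
    return matmul2(matmul2(t, f), conj_transpose2(t))

# Hypothetical system with characteristic roots 0.7 +/- 0.3i (modulus < 1).
PHI = [[0.7, -0.3], [0.3, 0.7]]
SIGMA = [[1.0, 0.0], [0.0, 1.0]]

# Covariance at lag zero: integrate F(omega) over [-pi, pi) on a uniform grid.
N = 2048
step = 2.0 * math.pi / N
gamma0 = [[0.0, 0.0], [0.0, 0.0]]
for k in range(N):
    F = spectrum_matrix(PHI, SIGMA, -math.pi + k * step)
    for i in range(2):
        for j in range(2):
            gamma0[i][j] += F[i][j].real * step

print(gamma0)
```

For this particular choice of `PHI`, $\Phi\Phi' = 0.58\,I$, so the variances obtained by integrating the diagonal of $F(\omega)$ should be close to $1/(1 - 0.58)$, which gives a simple check on the computation.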


An Application of the Spectral Method 


In order to illustrate the application of the spectral rep- 
resentation of the solution of a stochastic system, the method 
is applied to Klein's Model I. This model, described in Klein 
[22], consists of three behavioral equations relating to con- 
sumption expenditure, investment expenditure, and the private 
wage bill, and three identities. The complete six-equation 
system is as follows. 


Coie We? = 0.02 0.87 ew) teur 


(ACE) eee 2 ont OOS ea O Sie OR 7K 


i iD 


(ons we > 15 + 0.45(yeT-W5) + 0.US(y+T-W,) | 


+ 0.13(t-1931) + u, 


(49.4)    $Y + T = C + I + G$


(49.5)    $Y = \Pi + W_1 + W_2$


(49.6)    $\Delta K = I$


The endogenous variables include consumption C, non-wage income
$\Pi$, the private wage bill $W_1$, net investment I, the capital
stock at the beginning of the year $K_{-1}$, and income Y.

The exogenous variables are government wage payments $W_2$, business
taxes T, government expenditure G, and time t in years.







All variables are measured in constant dollars and the param- 
eters were estimated by the method of limited information max- 
imum likelihood from annual data for 1921-1941. 

The first step in the derivation of the spectrum matrix of 
the system is the computation of the spectrum matrix of the 
disturbance process. On the assumption that the residuals are 


serially uncorrelated, the spectrum matrix of the residuals is 


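(50)    $f(\omega) = \dfrac{1}{2\pi}\,E[\varepsilon_t'\varepsilon_t] = \dfrac{1}{2\pi}
        \begin{bmatrix}
         1.69 &  0.50 & -0.47 & 0 & 0 & 0 \\
         0.50 &  2.05 &  0.29 & 0 & 0 & 0 \\
        -0.47 &  0.29 &  0.59 & 0 & 0 & 0 \\
         0    &  0    &  0    & 0 & 0 & 0 \\
         0    &  0    &  0    & 0 & 0 & 0 \\
         0    &  0    &  0    & 0 & 0 & 0
        \end{bmatrix}$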


The non-zero entries in this matrix are estimates of the variances
and covariances of the disturbances $(u_1, u_2, u_3)$ given by
Klein [22, p. 72]. The transfer matrix $T(\omega)$ is then obtained
by inverting the complex-valued matrix $A(e^{-i\omega})$ where


its Oona 8u7 0 -.02 0 
0) ie a0 0 -,08-,68e “yen = 
in 0) Od Sora esses ann 0 
CD) NC ee eee f 0 0 
OR Oe = i 0 : 
0) = 10 0 0 eae 
for selected values of $\omega$ $(0 \le \omega \le \pi)$. Finally, the spectrum
matrix is obtained using (48).

The power spectra of the endogenous variables implied by
Klein's Model I are contained on the main diagonal of the
matrix $T(\omega)\,f(\omega)\,T^*(\omega)$ where $T(\omega) = A(e^{-i\omega})^{-1}$. The power spectra
of C, I, and Y are given in Table 1. The significance and
interpretation of the power spectra derive from the fact that
the spectrum provides a decomposition of the variance by frequency
components, i.e.,

(52)    $\gamma_{ii}(0) = \int_{-\pi}^{\pi} F_{ii}(\omega)\,d\omega$




where $\gamma_{ii}(0)$ denotes the variance of the ith endogenous variable
in the system. The results in Table 1 thus indicate that the
response of these endogenous variables of this system to random
disturbances exhibits a fairly regular oscillation with
a period of approximately 13.3 years, which corresponds to the
spectral peak near a frequency of 3/40 cycle per year.


Table 1

Power spectra of consumption, investment,
and income implied by Klein's Model I

Frequency                          Power
(cycles
per year)     Consumption     Investment     Income
0/40 0.88 0.00 0.88 
1/40 Oe ORS 2 S09 
2/40 Brerc) Zi o4 14.32 
3/40 9.09 5.26 Tipe Sl 
4/40 7.05 4.51 Elio OS) 
5/40 4.07 2.80 Sys 
6/40 2.44 ea 8.06 
7/40 1.60 ees DZS 
8/40 1.14 0.85 SOZ 
9/40 0.87 0.64 2.64 
10/40 0.70 0.49 2-7 
11/40 0.60 0.40 1.60 
12/40 Ol y515 ORparZ i Ol 
13/40 0.49 OZ Heep lel 
14/40 0.46 0.24 0.96 
15/40 0.44 OA Osis 
16/40 0.42 (05 1) Olea 
17/40 0.41 Oo JLY OnZ 
18/40 0.40 ORG 0.69 
19/40 0.40 0.16 0.67 
20/40 0.40 0.15 0.66








By way of comparison, the characteristic roots of the system
are 0.292 and 0.704 ± 0.344i. The pair of complex roots
has a period of 14 years subject to a damping factor of 0.784.
Thus the periodic response to random disturbances could have 
been anticipated from an examination of the characteristic 
roots of the system. However, it should be noted that the 
existence of complex characteristic roots is neither a neces- 


sary nor a sufficient condition for the implied power spectra 







to exhibit an interior relative maximum. (Howrey [16] has 
shown that the existence of complex roots does not necessarily 
imply that the power spectrum has an interior maximum and Chow 
[3] has shown that an interior maximum can exist even though 
the characteristic roots are all real.) Inferring the dynamic
behavior of the system from its characteristic roots is therefore somewhat difficult.
This example indicates that the power spectrum does provide 

a useful description of the stochastic response of the system. 
(For additional examples of the use of spectrum-analytic tech- 
niques to describe the dynamic properties of an econometric 
model, the reader is referred to Howrey [17] and Chow and 
Levitan [4].) 
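
The period and damping factor quoted above for the complex roots can be checked with a few lines of arithmetic. The sketch below is only an illustration: it takes the reported root 0.704 + 0.344i at face value, treats the time unit as one year, and assumes that the spectral peak sits at the 3/40 grid point, which is what a 13.3-year period implies. The variable names are assumptions introduced for the example.

```python
import cmath
import math

# One member of the complex pair of characteristic roots reported in the text.
root = complex(0.704, 0.344)

modulus = abs(root)             # damping factor applied each year
theta = cmath.phase(root)       # angle of the root, in radians per year
period_from_root = 2.0 * math.pi / theta

# Period implied by a spectral peak at 3/40 cycle per year (assumed peak location).
period_from_peak = 1.0 / (3.0 / 40.0)

print(f"damping factor   {modulus:.3f}")
print(f"period (root)    {period_from_root:.1f} years")
print(f"period (spectra) {period_from_peak:.1f} years")
```

The two periods come out close to one another but not identical, which is consistent with the remark that the roots anticipate the stochastic response without fully describing it.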


The Use of Spectral Representations in Validation Studies 


A question that arises quite naturally at this point is the 
extent to which the usefulness of the spectral representation 
is limited to a description of the solution of a system of 
stochastic equations. In particular, it may be of interest 
to consider the applicability of the spectral representation 
to tests of the validity of the model. A rather obvious pro- 
cedure that suggests itself at this point is a comparison of 
the power spectra implied by the model, $F(\omega)$, with the power
spectra $\hat{F}(\omega)$ estimated directly from the series of observations
on the endogenous variables. Significant differences
between the two might then be taken as an indication that the 
model is not correctly specified. 

An alternative time-domain approach might involve a compar- 
ison of the average distance between observed peaks of the 
endogenous variables with the mean distance between peaks 
implied by the model. This is the approach taken, for example, 
by Adelman and Adelman [1]. On the assumption that the dis- 
turbance process is normal, the mean distance between peaks 
can be calculated from the covariance functions [18]. In view 
of the relationship given in (48A), the covariance functions 


can in turn be computed from the spectrum matrix. It follows 







that the observations on the comparison of the estimated and 
implied power spectra also hold for a comparison of the esti- 
mated mean distance between peaks and the mean distance 
implied by the econometric model. 
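
The step from covariance functions to the mean distance between peaks can be made concrete. The sketch below uses a classical result for stationary Gaussian sequences: a peak at time t requires a positive first difference followed by a negative one, and for jointly normal differences with correlation $\tau$ that event has probability $\arccos(\tau)/2\pi$. This form is an assumption made for illustration; the expression used in [18] may be written differently. The function name and arguments are hypothetical.

```python
import math

def mean_peak_distance(gamma0, gamma1, gamma2):
    """Expected spacing between peaks of a stationary Gaussian sequence.

    gamma0, gamma1, gamma2 are the autocovariances at lags 0, 1, and 2.
    """
    rho1 = gamma1 / gamma0
    rho2 = gamma2 / gamma0
    # Correlation between successive first differences of the series.
    tau = (2.0 * rho1 - 1.0 - rho2) / (2.0 * (1.0 - rho1))
    # P(peak at t) = arccos(tau) / (2 pi); the mean spacing is its reciprocal.
    return 2.0 * math.pi / math.acos(tau)

# Serially uncorrelated series: a peak every three observations on average.
white_noise = mean_peak_distance(1.0, 0.0, 0.0)

# Positively autocorrelated series (hypothetical first-order scheme, 0.5).
smooth = mean_peak_distance(1.0, 0.5, 0.25)

print(white_noise, smooth)
```

Because the function needs only the covariances at lags 0, 1, and 2, any comparison based on it uses a subset of the information already contained in the spectrum matrix.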

Two points should be observed in connection with this approach.
First, if the model contains no exogenous variables,
this procedure is tantamount to a test of the hypothesis that
the residuals $\{\hat{\varepsilon}_t\}$ obtained by operating on the realization
$\{z_t\}$ by $A(L)$ are not significantly different from white
noise. Although this procedure has the advantage that the
estimated spectrum may be suggestive of a more appropriate 
model if the original formulation is not adequate, it does not 
represent an advance beyond the classical tests of signifi- 
cance of regression analysis which include tests of the inde- 
pendence of the estimated residuals. 

The second point is that additional complications arise if
there are any exogenous variables in the system, which is the
usual case in economic models. Returning to the complete
solution of the model given in (43), the limiting value of the
solution depends on both the disturbance process and the exogenous
variables. Thus it is the spectral representation of
$\frac{a(L)}{\Delta(L)}[Bx_t' + \varepsilon_t']$
that should be compared with the direct spectrum estimate
$\hat{F}(\omega)$. Therefore, in order to carry out the test in this case
the spectrum matrix of the process $\{x_t'\}$ must be obtained.

Once again, however, the test reduces to a check of the ability
of the filter $A(L)$ to reduce $z_t'$, net of $Bx_t'$, to a sequence of
uncorrelated disturbance vectors. We are thus left with the conclusion
that while the spectral representation of the solution
of a system of equations is of interest for descriptive purposes,
its use for testing the validity of the model appears
to be rather limited.


Bibliography


1.  Adelman, Irma & Frank. "The Dynamic Properties of the
    Klein-Goldberger Model," Econometrica, (October, 1959),
    596-625.

2.  Baumol, W.J. Economic Dynamics: An Introduction. New
    York: Macmillan Co., 1959.

3.  Chow, G.C. "The Acceleration Principle and the Nature of
    Business Cycles," Quarterly Journal of Economics, (August,
    1968), 403-418.

4.  Chow, G.C., and Levitan, R.E. "Nature of Business Cycles
    Implicit in a Linear Econometric Model," IBM Research
    Report RC 2085, Thomas J. Watson Research Center, Yorktown
    Heights, New York, 1968.

5.  Christ, C.F. Econometric Models and Methods. New York:
    John Wiley & Sons, 1966.

6.  Duesenberry, J., et al. The Brookings Quarterly Econometric
    Model of the United States. Amsterdam: North-Holland
    Publishing Co., 1965.

7.  Eisenpress, H., and Greenstadt, J. "The Estimation of Nonlinear
    Econometric Systems," Econometrica, (October,
    1966), 851-861.

8.  Evans, M.K., and Klein, L.R. The Wharton Econometric Forecasting
    Model. Philadelphia: Economics Research Unit,
    Department of Economics, Wharton School of Finance and
    Commerce, 1967.

9.  Fisher, F. The Identification Problem in Econometrics.
    New York: McGraw-Hill Book Co., 1966.

10. Fromm, G., and Taubman, P. Policy Simulations With an
    Econometric Model. Washington, D.C.: The Brookings Institute,
    1968.

11. Goldberger, A.S. Impact Multipliers and Dynamic Properties
    of the Klein-Goldberger Model. Amsterdam: North-Holland
    Publishing Co., 1959.

12. Goldberger, A.S. Econometric Theory. New York: John Wiley
    & Sons, 1964.

13. Haavelmo, T. "The Inadequacy of Testing Dynamic Theory by
    Comparing Theoretical Solutions and Observed Cycles,"
    Econometrica, (October, 1940), 312-321.

14. Hadley, G. Linear Algebra. Reading, Mass.: Addison-Wesley
    Publishing Co., 1961.

15. Holt, C. "Validation and Application of Macroeconomic
    Models Using Computer Simulation," 637-650. The Brookings
    Quarterly Econometric Model of the United States.
    Edited by J.S. Duesenberry, et al. Amsterdam: North-Holland
    Publishing Co., 1965.

16. Howrey, E.P. "Stabilization Policy in Linear Stochastic
    Systems," Review of Economics and Statistics, (August,
    1967), 404-411.

17. Howrey, E.P. "Stochastic Properties of the Klein-Goldberger
    Model," R.M. No. 88, Econometric Research Program,
    Princeton University, Princeton, N.J., 1967.

18. Howrey, E.P. "A Spectrum Analysis of the Long-Swing Hypothesis,"
    International Economic Review, (June, 1968),
    228-252.

19. Johnston, J. Econometric Methods. New York: McGraw-Hill
    Book Co., 1963.

20. Kendall, M.G., and Stuart, A. The Advanced Theory of Statistics,
    II. New York: Hafner Publishing Co., 1961.

21. Kelejian, H.H. "Two Stage Least Squares and Nonlinear Systems,"
    Working Paper No. 8, Princeton: Industrial Relations
    Section, Princeton University, 1968.

22. Klein, L.R. Economic Fluctuations in the United States:
    1921-1941. New York: John Wiley & Sons, 1950.

23. McManus, M. "Dynamic Cournot-Type Oligopoly Models: A Correction,"
    Review of Economic Studies, (October, 1962),
    337-339.

24. Nagar, A.L. "Stochastic Simulation of the Brookings Econometric
    Model," paper presented at the San Francisco
    Meetings of the Econometric Society, December, 1966.

25. Wold, H. "A Generalization of Causal Chain Models: (Part
    III of a Triptych on Causal Chain Systems)," Econometrica,
    (April, 1960), 443-463.

26. Yaglom, A.M. An Introduction to the Theory of Stationary
    Random Functions. Englewood Cliffs, N.J.: Prentice-Hall,
    1962.


Validation 


Richard Van Horn, Carnegie-Mellon University 


Introduction 


Simulation models change the state of our knowledge or, at 
least, beliefs, about some process. Or, in other words, sim- 
ulators are designed and used with a goal of learning some- 
thing. By all generally accepted definitions, a simulation is 
a symbolic or numerical abstraction of the process under study 
and is not the process itself. Thus "learning" from a simul- 
ation requires two stages. First, we must understand the be- 
havior of the simulator itself in terms of the relations that 
exist between inputs and results. The second, and often more 
difficult task, involves translating "learning" from the simulation
to "learning" about the actual process. The second
task, the translation of learning from the simulator to the
actual process, is generally viewed as the focus of the validation
process. Fishman and Kiviat [11] divide simulation testing
into three categories: "(1) Verification insures that a
simulation model behaves as an experimenter intends. (2) Validation
tests the agreement between the behavior of the simulation
model and a real system. (3) Problem analysis embraces
statistical problems relating to (the analysis) of data generated
by computer simulation."

This paper starts with the assumption that a set of "statist- 
ically significant" inferences are available from a simulation. 
The random number generator is truly random and the computer 
program correctly executes the logic desired by the modeler. 

The modeler has examined run length, replication, variance- 
reduction techniques and related problems of operation. Fin- 


ally, the results have passed adequate tests of statistical 







significance. These areas are clearly difficult and important, 
but they are outside the scope of the current discussion. 

Good discussions and surveys of these problems are found in 
Conway [5], Fishman and Kiviat [11,12], and Naylor [15,16]. 

Validation, in this paper, is the process of building an ac- 
ceptable level of confidence that an inference about a simulated 
process is a correct or valid inference for the actual process. 
Seldom, if ever, will validation result in a "proof" that the 
simulator is a correct or "true" model of the real process. A
simulator on a digital computer is a particular instance of a 
finite-state machine that will transform inputs into outputs. 
Thus Turing's proof [19] on the equivalence of finite-state 
machines implies that one can never prove that two machines are 
identical just by comparing input-output transformations no 
matter how large a (finite) sample is used. 

Fortunately, the users of simulators are seldom concerned
with proving the "truth" of a model. (The work of Newell and
Simon [17] on simulation of human thought is an exception to
this rule.) Instead, the simulator produces some specific insight
which needs validation. Often, a large number of actions
-- for example, the undertaking of various statistical tests, 
special data collection efforts, complementary studies and field 
tests -- will exist each of which may increase (or decrease) our 
confidence in the specific insights. One approach to valida- 
tion is to list a number of possible actions; however, no one 
wants the experimenter to take all possible actions. Some are 
inapplicable; others are duplicative, and all involve a cost. 
The experimenter is supposed to select a set of actions. Thus, 
in concept, at least, validation reduces to a standard decision 
problem -- to balance the cost of each action against the value 
of increased information about the validity of an insight. In 
this view, two important characteristics of the validation pro- 
blem are: 


1. The objective is to validate a specific set of insights 
not necessarily the mechanism that generated the insights. 




2. There is no such thing as "the" appropriate validation 
procedure. Validation is problem-dependent. 


Validation clearly applies to a far more general environment 
than simulation; validation is a problem associated with all 
modeling. It seems appropriate to ask, "Why even discuss val- 
idation (or statistical analysis, etc.) in the context of sim- 
ulation?" Several reasons come to mind. Simulations tend to 
become far more complex than other Management Science models. 


Most analytic models either deal with small problems -- for 
example queuing models -- or deal with common parts of large 
problems -- input-output models. Simulators allow the modeler 


to include many different parts and processes in one model and 
allow the parts to interact in non-linear, non-stationary modes. 
In addition, simulators conceal their assumptions and pro- 
cesses, certainly from the casual observer, and often from their 
designer. The simple statement that model x is a linear pro- 
gramming model conveys a great deal of information about its 
structure, assumptions, and limitations. The statement that 
model y is a simulation conveys virtually no information. Finally,
simulators, either explicitly or implicitly, often claim
to represent "reality." Economists would not claim nor would
managers believe that the set of differential equations from 
the theory of the firm represented the firm's actual decision 
process. But simulators look real and both modelers and managers 
find them easy to believe. For all of those reasons, valida- 


tion holds a special and important role in simulation. 


Aspects of Validation 


Two broad questions appear relevant to building a framework 
for validation. First, what are the characteristics of the 
processes or systems that are common to the management and 
social sciences? Second, what methodologies or approaches 
should enter into validation? 

Earlier in this paper validation was characterized as pro- 


blem-dependent. Thus, the major attributes of the processes 







that are simulated should guide any general discussion of val- 
idation. Clearly, the entire spectrum of problems of interest 
to simulation modelers in the management and social sciences is 
very broad. For example, models frequently are built to sim- 
ulate the behavior of complex problems such as queuing systems 
under a set of explicit mathematical assumptions. When the 
actual process is specified by the modeler, the validation pro- 
blem as defined here does not exist. Simulation of intelligent 
behavior represents the other extreme. Here the task of find- 
ing a satisfactory description for the actual process is immensely 
more difficult and controversial than comparing the actual to 

a simulator. 

Most management science simulations involve a production or 
service facility -- for example hospitals, job shops, transit 
systems and air traffic -- or perhaps involve some aspect of 
the economy. In general these systems are characterized by the 
following: 


1. The structure and parameters of the process are determined 
by the environment not by the modeler. 


2. Part of the process depends upon physical phenomena --
the behavior of aircraft, trains or drill presses. 


3. People are part of the process either directly as infor- 
mation-processors and decision-makers or indirectly as 
consumers. 


4. The process tends to consist of many parts and the be- 
havior of the process depends on interaction between the 
parts. 


Much of this discussion is relevant to other forms of simula- 
tion -- for example engineering or physics simulation of atomic 
particles, satellite orbits, etc. -- but the primary focus is on 
problems suggested by the above environment. 

After examining philosophical views on validation, Naylor and 
Finger [15] suggest a three stage approach. In a slightly gen- 
eralized form, the three phases are: 


1. Construct a set of hypotheses and postulates for the pro- 
cess using all available information -- observations, 
general knowledge, relevant theory and intuition. 


2. Attempt to verify the assumptions of the model by subject- 
ing them to empirical testing. 




3. Compare the input-output transformations generated by the 
model to those generated by the real world. 


These three phases appear to capture the major ways to build 
confidence in a model and the subsequent discussion will fol- 


low this structure. 


Model Construction 


Management Science processes, as described, consist of three
main components: people, physical processes, and an organization
structure. The simulator must find some way to represent
these components in the model and these models will possess varying
degrees of a-priori confidence. When a process is easy to
observe and measure, the confidence in its representation is
high. Some representations will be "well-known" in the sense
that some previous validation has occurred. For example, it is
often possible to represent machines and physical processes by 
production-functions with a substantial degree of confidence. 

For many processes, model confidence is increased by the 
existence of an extensive body of research. Some examples are 
Highway Traffic, Air Traffic, Job Shop, Elevator Operation, 
Machine Failures, and Telephone Switching. For many of these 
activities, the mathematics of stochastic processes provide 
a strong theoretical base for modeling even though direct anal- 
ytic solutions are not known. 

Clearly, people are harder to model. In a stationary world, 
probability distributions provide reasonable models for arrivals 


at a supermarket or subway stop. Models to describe behavior in 


a non-stationary situation -- the opening of a new supermarket 
or a subway fare raise -- are harder to find. Now the modeler 
faces questions of utility or preference. In many systems, 


overall system behavior is strongly influenced by people acting 
as decision-makers or information processors and at this point, 
models become extremely scarce or at least controversial. The 
satisficing, limited-capability man of March and Simon [18] 


appears, on the surface, to differ greatly from the rational, 




optimizing Economic man. If the modeler accepts the March- 
Simon view, he can with great effort construct a model of a 
specific type of man; but an operable general model has yet to 
appear. (For example, Clarkson [2] devised a model of a trust 
investment officer.) 

Unfortunately, man's intellect beclouds even the production 
functions for his physical activities. A number of researchers 
have observed that jobs take longer when there isn't much work 
to be done -- the so-called job-shop effect. Cyert and March 
[7] generalize this effect to a concept that they call organ- 
izational slack. However, the problem remains the same; good 
models for human behavior are hard to find. 

The above implies that the initial confidence attached to 
most representations of human behavior will tend to be low. 
Subsequent validation by empirical testing of assumptions or 
input-output transformations tends to involve unwieldy statist- 
ical properties and, in general, is difficult and costly. The 
scarcity of empirical work testing the March-Simon postulates 
offers some evidence of the difficulty of verifying models of 
man as a complex-information processor. 

These problems of finding adequate symbolic models of human 
behavior have led a number of investigators to man-machine or 
game simulations. Man-machine simulations attempt to solve 
the representation of human behavior by inserting a person 
directly in the simulation. The SAGE research by RAND (and
later by the Systems Development Corp.) is now a classic example
of this solution. The SAGE man-machine simulation subsequently
was used for training but its original purpose was
research with a high confidence representation of human behav- 
ior. In the man-machine simulations of the RAND Logistics 
Simulation Laboratory, men again were introduced explicitly 
because an adequate model of their behavior was unavailable [9]. 

Although one can question whether a man in a simulation is 
a valid representation of a man in a different process, most 


people will agree to placing higher a-priori confidence on a 







man than on most models of him. However, this solution raises 
a host of new problems. Substantial time compression is ruled 
out; most people find microsecond operating cycles beyond their 
abilities. Since data produced by people are costly and noisy, 
severe problems arise when one attempts to draw "statistically
significant" inferences from the simulation itself. Thus man-
machine simulation trades increased validity for a loss in 
analysis capability. 

In most situations people and physical processes are linked 
by information and decision flows -- an organization structure. 
A common simulation model for an organization is a simple noise- 
free network for instantaneous transmission of discrete mes- 
sages. There is general agreement that real organizations are 
more complex, but again there are no widely accepted complex 
models. One obviously can improve face validity by adding error 
and delay mechanisms to the simple network model. Bonini [1] 
constructed an elaborate organizational model with motivational 
factors and interactions between human behavior and physical 
processes. However, Bonini used his model to explore the con- 
sequences of a set of postulates; he did not argue that it was 


a valid general representation of an organization. 


Empirical Testing of Assumptions 


The notion of subjecting assumptions, parameters and distri- 
butions to empirical testing appears eminently reasonable. The 
statistical theory of estimation and hypothesis testing provides 
a rigorous approach to this task. A model with untested, un- 
testable, or refuted assumptions is at least disturbing. Most 
articles on simulation applications report some form of assump- 
tion testing even if it is only an eyeball comparison of means 
and ranges. And articles on validation (the few that exist) 
offer a list of appropriate statistical tests. Some degree of 
assumption testing appears essential to validation. 

Two qualifications to this testing deserve mention. First, 


finding a "genuine" underlying distribution or assumption to


observe is a non-trivial task. For example, consider a manual 
order-processing system that is scheduled for conversion to a 
computer system with remote terminals. It is reasonable to 
assume that orders arrive randomly and are simulated by a 
model that uses the average arrivals per time-unit as the mean 
for a Poisson distribution. If the underlying arrival distri- 
bution is essentially random but the manual system collects 
and forwards orders in batches, the data collected at the 
order receipt point may show a variance to mean ratio that is 
significantly larger than one. The data thus refute the Pois- 
son assumption. However, if the computer system eliminates 
batching, the Poisson assumption is valid; and the data (as 
interpreted) are wrong. This type of problem is common in in- 
ventory and service systems. One suspects that a whole array 
of similar problems lurk in the world. 
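
The order-processing example can be reproduced in a few lines. The sketch below is a hypothetical illustration, not data from any actual system: it assumes orders that truly arrive as a Poisson process with a mean of 4 per period, and a manual system that holds them and forwards the accumulated batch at the end of every fifth period. The rate, the batch length, the seed, and all function names are assumptions made for the example.

```python
import math
import random

def poisson(rng, lam):
    """Draw one Poisson(lam) count by Knuth's multiplication method."""
    limit, count, prod = math.exp(-lam), 0, 1.0
    while True:
        prod *= rng.random()
        if prod <= limit:
            return count
        count += 1

def batch(counts, size):
    """Release accumulated orders only at the end of every `size` periods."""
    released, held = [], 0
    for period, count in enumerate(counts, start=1):
        held += count
        if period % size == 0:
            released.append(held)
            held = 0
        else:
            released.append(0)
    return released

def variance_to_mean(xs):
    """Ratio of the sample variance to the sample mean (1 for Poisson data)."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs) / m

rng = random.Random(1969)
arrivals = [poisson(rng, 4.0) for _ in range(20_000)]   # the underlying process
observed = batch(arrivals, 5)                           # what the receipt point records

ratio_true = variance_to_mean(arrivals)
ratio_observed = variance_to_mean(observed)
print(f"underlying arrivals: {ratio_true:.2f}   batched data: {ratio_observed:.2f}")
```

The batched record carries exactly the same orders, yet its variance to mean ratio is far above one, so a test applied at the receipt point would reject an assumption that is in fact appropriate for the system being designed.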

Second, empirical testing of assumptions often has a lower 
cost substitute -- sensitivity testing. A number of results 
exist in statistics and probability that are true for classes 
of distributions or even for a general distribution. It seems 
reasonable to expect that often the insight gained from a sim- 
ulation will not depend on a specific distribution. In a sim- 
ilar fashion, an insight normally is relevant for a range of 
parameter values. Sensitivity testing can establish the set 
of distribution and parameter values for which a set of in- 
sights is relevant. In this way, the requirement for and cost 
of testing assumptions empirically is reduced. In addition, 
the model insights now apply to a much broader set of processes 
and presumably are of greater value to the research community. 
Conway [6] raises the question of whether researchers (not 
implementers) should ever become involved with empirically-based 
simulators. 
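
A sensitivity test of this kind might be sketched as follows. The example is hypothetical throughout: it assumes a single-server queue with Poisson arrivals and a mean service time of one, simulated by the Lindley recursion, and it asks whether one insight ("raising utilization from 0.5 to 0.8 lengthens the average wait several times over") survives a change in the assumed service-time distribution. The three service models, the run length, and the seed are assumptions chosen for illustration.

```python
import random

def mean_wait(draw_service, utilization, customers=100_000, seed=7):
    """Average queueing delay from the Lindley recursion W' = max(0, W + S - A)."""
    rng = random.Random(seed)
    wait, total = 0.0, 0.0
    for _ in range(customers):
        total += wait
        service = draw_service(rng)
        # Mean service time is 1, so the arrival rate equals the utilization.
        gap = rng.expovariate(utilization)
        wait = max(0.0, wait + service - gap)
    return total / customers

# Three candidate service-time assumptions, all with mean 1.
SERVICE_MODELS = {
    "exponential": lambda rng: rng.expovariate(1.0),
    "uniform": lambda rng: rng.uniform(0.0, 2.0),
    "constant": lambda rng: 1.0,
}

ratios = {}
for name, draw in SERVICE_MODELS.items():
    ratios[name] = mean_wait(draw, 0.8) / mean_wait(draw, 0.5)
    print(f"{name:12s} wait(0.8) / wait(0.5) = {ratios[name]:.2f}")
```

If the ratio is large under every service model tried, the insight does not hinge on which of them is correct, and the empirical effort can be spent on the assumptions that do matter.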

Since many simulators do focus on an actual process, the 
need to test assumptions does arise. The papers by Naylor and 
Finger [16] and Fishman and Kiviat [11] provide a good review
of statistical tests of means and variances, analysis of variance,
regression, factor analysis, spectral analysis and autocorrelation,
chi-square, and non-parametric tests. Both
articles point out that all statistical tests make some assump- 
tions about the nature of a process. Thus the tests themselves 
are subject to questions of validity. 

Some tests require fewer assumptions than others, but in 
general the power of tests will decrease as one relaxes assump- 
tions. Mood and Graybill [14] point out that for comparing 
sample means from a normal population, the (non-parametric) 
Mann-Whitney test has an asymptotic relative efficiency equal 
to 95% of that of a t test, "a small price to pay when the assumption
of normality is suspect." Thus for large samples, at
least, some validity problems again can be reduced by reducing 
the dependence on assumptions. In the same spirit, Fishman 
and Kiviat suggest a variance test described by Cochran [3] 
to test goodness-of-fit without the need to assume class inter- 
vals. Perhaps this area is summed up best by a Fishman and 
Kiviat statement (or understatement) that "(statistical) val- 
idation, while desirable, is not always possible." 
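
For readers unfamiliar with the Mann-Whitney test mentioned above, the statistic itself needs no distributional assumption and is simple to compute. The following is a minimal sketch using the usual large-sample normal approximation without a tie correction; the function names are ours, not those of any cited paper.

```python
from math import sqrt

def mann_whitney_u(xs, ys):
    """U = number of (x, y) pairs with x > y, counting ties as one half."""
    u = 0.0
    for x in xs:
        for y in ys:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u

def z_score(u, n1, n2):
    """Standardized U under the hypothesis of identical distributions."""
    mean = n1 * n2 / 2.0
    sd = sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    return (u - mean) / sd
```

Here `xs` might be simulator output and `ys` observations from the actual process; a `z_score` far from zero suggests that one sample tends to exceed the other.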


Comparison of Input-Output Transformations 


A digital simulator, as referenced earlier, is a finite- 
state machine for transforming an input set into an output set. 
Since insight comes from observing and analyzing the transfor- 
mation, overall confidence in the insight clearly depends 
greatly on confidence in the transformation process. One ob- 
vious way to gain confidence is to compare the output of the 
simulator and actual process using, if possible, identical 
input. Many of the statistical tests discussed in the pre- 
vious section are again relevant for this comparison as are 
the limitations. Often simple comparisons of means, ranges 
and variances and graphical comparison of distribution or time 
behavior will capture most of the available information. 




Since simulation produces a set of time series, methods that 
look at time series appear particularly appropriate. One of 
the more interesting suggestions is the use of spectral anal- 
ysis [10,13,15]. Many simulation outputs are auto-correlated 
and spectral techniques provide the autocorrelation characteristics
in a convenient form for analysis and comparison. Jenkins [13]
and Fishman and Kiviat [10] describe procedures to
test the equivalence of two spectra. If the spectra are equivalent,
the modeler certainly increases his confidence in
the model. If the spectra are not equivalent, the interpretation
is more difficult. Often the relation between any deviation
in spectra and confidence in a particular insight is unclear.
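
The raw material for such a comparison, the sample autocorrelations and a smoothed spectrum built from them, can be sketched as follows. This is not the equivalence test of Jenkins or of Fishman and Kiviat; it is only an illustration, and the first-order autoregressive series and the Bartlett lag window are assumptions of ours.

```python
import random
from math import cos, pi

def autocorrelation(series, max_lag):
    """Sample autocorrelations r_0, ..., r_max_lag of a time series."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    c0 = sum(d * d for d in dev) / n
    return [sum(dev[t] * dev[t + k] for t in range(n - k)) / n / c0
            for k in range(max_lag + 1)]

def bartlett_spectrum(acf, freqs):
    """Normalized spectrum estimate from autocorrelations (Bartlett window).

    Frequencies are in cycles per observation, between 0 and 0.5.
    """
    m = len(acf) - 1
    return [acf[0] + 2.0 * sum((1.0 - k / (m + 1)) * acf[k] * cos(2.0 * pi * f * k)
                               for k in range(1, m + 1))
            for f in freqs]

def ar1(rng, phi, n):
    """Hypothetical autocorrelated output: x_t = phi * x_(t-1) + noise."""
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out
```

Computing `bartlett_spectrum` for the simulator output and for the actual series on the same frequencies gives two curves that can be plotted or compared point by point.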

Spectral analysis faces several other problems. First it 
requires a large number of observations. The cost of data 
collection on an actual process or a man-machine simulator may 
preclude obtaining a sufficient sample for the use of spectral 
techniques. Another requirement is even more restrictive. The
procedures described above apply to "covariance stationary" processes.
But many simulators are designed precisely because the
process under study is non-stationary. For example, consider 
a simulation of a computer center. At 8:00 a.m. the system 
starts off with a zero backlog of priority jobs (jobs submitted 
by the research staff). The expected length of the backlog may 
increase during the day until 5:00 p.m. when the staff goes 
home. The center reduces the backlog to zero and then runs 
background or sells time. 

Thus the process never reaches steady-state. The backlog 
statistics are direct functions of clock-time and are certainly 
not covariance-stationary. Problems with some sort of start- 
stop phenomena or time-varying parameters abound in the simu- 
lation environment. 

Ideally a comparison test should handle non-stationarity, 
compensate for noisy data, simultaneously evaluate a number of 
output measures and work for small samples. Does such a test
exist? The answer is yes if one is willing to define "test"
very broadly. The test is simple. Find people who are directly
involved with the actual process. Ask them to compare actual 
with simulation output. To make the test a little more reput- 
able, one might offer several sets of simulated data and sev- 
eral sets of actual data and see if the "experienced" people 
can tell which is which. One might even test the classifica- 
tion for statistical significance. If people can discriminate, 
ask them how they do it. The experimenter can then decide if 
the detectable difference affects the inferences that he wishes 
to make. 
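
Testing the classification for statistical significance can be as simple as an exact binomial tail probability. The sketch below assumes that each data set is judged independently as "actual" or "simulated" and that a judge who cannot discriminate is right with probability one-half; both are assumptions for illustration.

```python
from math import comb

def binomial_tail(correct, trials, p_guess=0.5):
    """Probability of at least `correct` right answers out of `trials`
    if the judge is merely guessing."""
    return sum(comb(trials, k) * p_guess**k * (1 - p_guess)**(trials - k)
               for k in range(correct, trials + 1))
```

For example, a judge who classifies nine of ten data sets correctly has a tail probability of 11/1024, or about 0.011, so guessing is an unlikely explanation.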

This test is sometimes attributed to Turing although Turing 
himself might not have wanted to claim it. Turing [20] actually 
was trying to find an operational definition of human intelli- 
gence when he suggested a similar procedure. The idea is cer- 
tainly appealing and deserves further exploration. It is pro- 
bably a great improvement over having the modeler use his in- 
tuition to validate his model. However, whether further work 
will lead to any general conclusions on the power of the test 
is doubtful. 

Assume that a model has passed a reasonable set of tests for 
input-output equivalence with an actual process. Often the 
modeler has little interest in the tested situations -- those 
represented by the empirical data. One reason for building 
simulators is to explore situations for which no empirical data 
exist. In this event, the inferences represent extrapolation 
from the experience base. The experimenter must now ask whether 
his insight applies to a property of the actual process or merely
to a peculiarity of the simulation that only affects the extrapolated
situation. There is no answer to this question in
the simulation situation. If the modeler wishes to further in- 
crease confidence, he must look outside. 

Complementary research offers one path to further confidence. 
The basic idea is to re-state the questions that led to the sim- 
ulation result in a different context. For example if the sim- 
ulation leads to insight that appears strongly dependent upon 
human behavior, the next step might be to conduct psychological
experiments. Physical scientists and engineers have long been
accustomed to this sequence. Theoretical or abstract model
results are tested in a series of small experiments. If all
goes well, large, complex experiments are attempted. Despite
extensive, rigorous, validity testing, the design for a new 
aircraft does not go from paper to production. Instead, it is 
subjected to a great deal of complementary research. 

Complementary research appears less common in the manage- 
ment and social sciences. For example, the notion of building 
prototype information or management systems specifically for 
research deserves serious attention in view of the large ex- 
penditures that go into such systems. Dunlop [8] reports that 
IBM is engaged in some activities of this nature but certainly 
no widespread pattern is visible. For this discussion, a prototype
is defined as an iconic model -- often a simplified abstraction
of an actual process -- operated explicitly for research
and testing purposes.

A more common activity is the field-test. The field-test 
places an "actual process" in an operational situation and 
tries to measure performance. In many field-test situations, 
the operational decisions are dominant. At the first sign of 
real or imagined difficulty, experimental controls and data col- 
lection are abandoned or seriously compromised. As a prelude 
to implementation, field tests undoubtedly are valuable. But
given their many problems, their usefulness as a representation
of reality is dubious. If simulation is the last resort
of the analyst, then field tests are the last resort of the
validator.


A Validation Example 


Validation clearly does pose a large number of problems. 
The experiences encountered during Laboratory Problem Four 
(LP-IV) in the RAND Logistics Simulation Laboratory illustrate 
a number of the points discussed in the previous sections.
(Cohen and Van Horn [4] provide a more complete description
of the LP-IV experiment.) LP-IV centered around a man-machine
simulation of aircraft recovery operations at an Air Force Base. 
In many aspects, the problem resembles a complex job shop. Aircraft,
as a result of alert and training activities, generate
both planned and unplanned maintenance demands. Planned activities
include preflight and postflight inspections and fueling.
Unplanned jobs result from problems encountered in flight or
during ground inspections. Performance of the jobs requires
men from over twenty different skill areas plus extensive equipment
and facilities.

The objective of the project was to achieve more effective 
use of aircraft by reducing turnaround time -- the time from 
landing of an aircraft until it is ready for its next flight. 
The mechanism for this reduction was limited to changing the 
management system -- scheduling and control procedures. Re- 
source levels and the production-functions for resources were 
viewed as fixed. 

A number of previous RAND studies had examined the mainten- 
ance and operating characteristics of aircraft so that a rea- 
sonable research base was available for construction of envir- 
onment models. Preliminary studies of existing Air Force 
scheduling and control indicated a high degree of complexity and
a lack of any recognizable structure. The process has a very 
strong human information-processing and ad hoc decision-making 
element. None of the common "dispatch rules" from job-shop 
studies appeared to capture more than a small part of the pro- 
cess. Initial runs of a computer simulation of the process 
strengthened this view. In addition, Air Force controllers, 
who were contacted during a number of field visits, expressed 
great doubt that control people would or even could respond to 
some of the contemplated changes. At this point, the LP-IV 
staff decided that introducing actual Air Force Controllers into 
a man-machine simulation was required to achieve reasonable con- 
fidence in any insights. 

The staff further agreed to produce a special validation
run -- the Benchmark. The Benchmark combined a simulated (iconic)
version of the existing information system and control procedures
with the computerized environment models of aircraft
and maintenance men, equipment, and facilities. This step provided
the mechanism for direct comparison of output between
the simulation and the actual process. In contrast to a similar
all-computer run of this nature, the cost was high. About
four man-years of our development effort, two man-months of
Air Force time and twenty thousand dollars of computer time
went into this effort. (Exact cost apportionment is difficult
because a substantial part of the effort carried over to later
phases of the experiment.) Previous RAND Laboratory Project
studies (LP I, II and III) did not include a Benchmark. One
modeled a non-existent future environment and the other two
modeled inventory processes for which initial confidence in the
model and estimation processes was believed high enough that
a Benchmark was not needed.

The first organization structure for the Benchmark was a 
simple noise-free network with constant delays. For example, 
the controller received information on maintenance needs fif- 
teen minutes after an aircraft landed. This model subsequently 
failed several validity tests -- it was simply too good. After 
another round of field observation and data collection, it was 
replaced with a model that introduced random delays and random 
errors into the information and decision flows. Under the new 
mode, some jobs were reported long after the aircraft landed,
occasionally a team that was sent out to work on a job didn't
go, and so on.

Despite the extensive experience, even the aircraft-maintenance
environment models presented a challenge due to inadequate
data. The Air Force at that time collected "total man hours"
for each job. The models used team size and elapsed time.
Part of the problem was resolved by special data collection but
some areas were not covered. For these areas, senior Air Force
maintenance people were asked to convert a stratified set of
total-time observations into their elapsed-time and team-size
equivalents. These estimates were then used by the LP-IV staff
to construct the sampling distributions.

The LP-IV environment did not provide a very good basis for 
testing assumptions. For some parameters such as the ones de- 
scribed above, empirical data was unavailable. When data was 
available, the empirical distributions (with minor smoothing 
and truncation) were used directly. However, a critical envir- 
onment assumption -- the relation between various aircraft ac- 
tivities and resulting maintenance workload -- did look test- 
able. Existing Air Force planning policies relate workload to 
flying-hours. Some convincing a-priori arguments and our dis- 
cussions with our Air Force advisors suggested that the act of 
flying -- the sortie -- regardless of flight length, was the 
prime generator of workload. 

In the LP-IV model, workload does generate directly from the 
sortie. Aside from some limited special tests, only aggregate 
data -- total man hours, sorties, and flying hours -- were available.
Regressions on this data produced only discouragement.
The constant term was large and r² was small. Furthermore,
sorties and flying hours are highly correlated. Part of
the problem is explained by the fact that the correlation be- 
tween available manhours (a manhour measure in the accounting 
system) and expended man-hours was high. One inference of these 
results is systematic bias in the data. Reported manhour ex- 
penditures apparently are tailored to fit available manhours -- 
a not surprising phenomenon. One might at this point feel forced 
to reject the model's assumptions. However, it is important to 
look back at the insight that LP-IV is after -- the prediction 
and improvement of aircraft turnaround. For this purpose, a 
model that relates workload to time available is useless and in- 
appropriate. This relation probably exists only in the account- 
ing system. 

Fortunately, LP-IV obtained (at high cost to several staff
members) a special set of data with two types of sorties, one
approximately three times the length of the other. These data
show a much stronger relation of workload with sorties than with
flying hours. The sorties with a threefold increase in flying
show only a 25% increase in workload. Although the true
situation is far from clear, we conclude that the assumption of 
workload related only to sorties was reasonable for our partic- 
ular purpose. We also conclude that one should use empirical 
data with extreme care and a large amount of skepticism. 
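
As a rough check on the arithmetic, and purely as an illustrative assumption that the LP-IV study itself did not make, suppose the workload per sortie is linear in flight length, W = a + bh, where a is the part generated by the act of flying and bh the part that grows with flying hours h. The two sortie types then give:

```latex
\frac{W_{\text{long}}}{W_{\text{short}}} = \frac{a + 3bh}{a + bh} = 1.25
\;\Longrightarrow\; 1.75\,bh = 0.25\,a
\;\Longrightarrow\; bh = \frac{a}{7},
\qquad
\frac{bh}{a + bh} = \frac{1}{8}.
```

Under that assumed linear form, only about one-eighth of the workload of the shorter sortie would be attributable to flying hours, which is consistent with treating the sortie as the prime generator of workload.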

The Benchmark model ran for five simulated weeks. For input- 
output comparison, the staff obtained four months of data from 
an Air Force Base with a flying program that was almost identical 
to the laboratory. These data encompass 130 sorties for the 
laboratory and over 500 for the actual process. Unfortunately,
the actual workload data is four observations of average man-hours
per sortie for each of twenty skill groups. The strong
tendency of real world data systems to aggregate data has re- 
peatedly frustrated our attempts to conduct a reasonable vali- 
dation. 

Within the limits of the data, the comparisons of the work- 
load for the laboratory run with the four actual observations 
look reasonable. The largest deviation is 2σ for one of the
low-workload skill groups and over half the skill group means agree
within one standard deviation. Air Force bases place consider- 
able emphasis on meeting the take-off schedule. Take-off devia- 
tions in the environments we were examining are rare but do 
exist; the lab model showed the same behavior but the sample is 
too small for any meaningful statistical analysis. 

The primary focus of LP-IV was turnaround. Unfortunately, 
aircraft turnaround data were not and still are not collected. 
Special data collection for turnaround, which would have re- 
quired three people for at least a month at a base, was rejected 
as too expensive. 

Since LP-IV was a man-machine simulation, with experienced
Air Force participants, some form of "Turing" test has great
appeal. Thus at the end of the Benchmark, the ten Air Force
participants were shown the data and extensively questioned on
their reactions to the data and to the experiment itself. They
strongly reported that turnaround times were consistent with 
their actual experience. They also pointed out many problems 
including (1) the lack of noise in the organization structure, 
(2) restriction with concurrent work on aircraft and (3) inade- 
quate representation of service activities -- fueling, towing, 
and inspecting aircraft. These were corrected to their satis- 
faction. 

Subsequent runs examined policy innovations centered around 
the introduction of new scheduling procedures and a modified
information system. Results were examined by analysis of variance
using an F test. The mean time for turnaround dropped by
39% below the Benchmark (significant at the .001 level). The 
same was true for several other important measures. Thus the 
simulation itself did generate statistically significant results. 
But in view of the validation difficulties, LP-IV concluded that 
the confidence that these results held for the actual process 
was still too low. 
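
The F statistic behind such an analysis of variance can be sketched in a few lines. This is a generic one-way layout and not a reconstruction of the LP-IV analysis; the group structure (one list of turnaround times per scheduling policy) is our assumption, and the significance level would still have to be read from an F table.

```python
def one_way_f(groups):
    """F statistic for a one-way analysis of variance.

    groups: one list of observations per treatment (e.g. per policy).
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))
```

With the invented observations `[[1, 2, 3], [2, 3, 4]]` the function returns 1.5, on 1 and 4 degrees of freedom.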

Two efforts to increase confidence were investigated. The 
first was a series of psychological experiments to determine 
whether the dramatic improvement in scheduling was related to 
some accident of the simulation or whether it was reproducible 
in a different environment. The psychological experiments used 
forty college students with no knowledge of the Air Force ver- 
sion of the scheduling task. The same results appeared and the 
difference between scheduling modes was again statistically sig- 
nificant. This experience certainly did increase the confidence 
that the difference in scheduling systems was valid for the 
actual process. 

For a second complementary research effort, an Air Force group 
and the LP-IV staff decided to conduct a field test. In terms 
of the earlier discussion, LP-IV preferred a prototype test, but 
the Air Force could not obtain agreement for any control over or 
interference with operational requirements. During the first 
month, the field-test did show a large (and significant) reduction
in turnaround. At this point, the Controllers complained
about their workload (the requirement to preserve operational
integrity resulted in much duplicate work) and the procedures 
were modified. The test went on and a great deal of useful 
data was generated. However, the procedural changes plus sev- 
eral major changes in operations make any direct comparisons 
of questionable value. 

One suspects that LP-IV does not represent an isolated inci- 
dent in the simulation world. The difference between looking 
for a problem which fits a technique and fitting techniques to 
a given problem is well-known. LP-IV started with a problem and 
tried to solve it. Many validation techniques, particularly 
statistical ones, appeared highly desirable but practical lim- 
itations precluded their use. One, of course, can greatly 
improve the picture presented here by selective reporting; but 
perhaps this one is more indicative of the world. Should one 
give up at this point? LP-IV chose to go ahead and tried a variety
of ways to achieve confidence in the results. In retrospect,
some of us feel that too little of the total effort went
into validation. For example, the decision not to incur the
cost of collecting turnaround data was reversed for subsequent
work in this area. So, hopefully, some of the mistakes in LP-IV
will have long-run benefit toward improving validation ability.


Concluding Remarks 


Simulation offers the most flexible and realistic representa- 
tion for complex problems of any quantitative technique. Its 
look of realism makes it a frequently preferred technique for 
large significant problems. Thus, many of the aspects that make 
validation difficult for simulation also give validation a great 
deal of importance. Decisions, often major decisions, are made 
on the basis of simulation results. 

Knowledge about appropriate statistical tests for validating
simulations is increasing. However, testing suffers from the
standard problems of empirical research: (1) small samples due
to high cost of data, (2) too aggregate data, and (3) data whose
own validity is questionable.

When adequate data is available, statistical tests are an 
essential part of validation, but the overall validation process 
should encompass much more. In rough order of decreasing value- 
cost ratios, some of the possible validation actions are: 


1. Find models with high face validity.

2. Make use of existing research, experience, observation
and any other available knowledge to supplement models.

3. Conduct simple empirical tests of means, variances, and
distributions using available data.

4. Run "Turing" type tests.

5. Apply complex statistical tests on available data.

6. Engage in special data collection.

7. Run prototype and field tests.

8. Implement the results with little or no validation.

The real task of validation is finding the appropriate set of
actions. Hopefully, this discussion sheds some small insight
on that activity.


Bibliography 


1. Bonini, Charles P. Simulation of Information and Decision
Systems in the Firm. Englewood Cliffs, N.J.: Prentice-Hall, 1963.

2. Clarkson, G.P.E. Portfolio Selection: A Simulation of
Trust Investment. Englewood Cliffs, N.J.: Prentice-Hall,
Inc., 1962.

3. Cochran, W.G. "Some Methods for Strengthening the Common
χ² Test," Biometrics, X (4) (December, 1954).

4. Cohen, I.K., and Van Horn, R.L. "A Laboratory Experiment
for Information System Evaluation," Information System
Science. Edited by J. Spiegel and D. Walker. New York:
Spartan Books, Inc., 1965.

5. Conway, R.W. "Some Tactical Problems in Digital Simulation,"
Management Science, X (1) (October, 1963).

6. Conway, R.W. An Experimental Investigation of Priority
Assignment in a Job Shop. Santa Monica, Calif.: The RAND
Corporation, RM-3789-PR, 1964.

7. Cyert, R.M., and March, J.G. A Behavioral Theory of the
Firm. Englewood Cliffs, N.J.: Prentice-Hall, Inc., 1965.

8. Dunlop, R.A. "Some Empirical Observations on the Man-Machine
Interface Question," Proceedings of the 1968
Carnegie-Mellon Symposium on Management Information
Systems (to be published).

9. Geisler, M.A., Haythorn, W.W., and Steger, W.A. Simulation
and the Logistics Systems Laboratory. Santa Monica, Calif.:
The RAND Corporation, RM-3281-PR, 1962.

10. Fishman, George S., and Kiviat, P.J. "The Analysis of
Simulation-Generated Time Series," Management Science, XIII
(March, 1967).

11. Fishman, George S., and Kiviat, P.J. Digital Computer
Simulation: Statistical Considerations. Santa Monica, Calif.:
The RAND Corporation, RM-5287-PR, 1967.

12. Fishman, George S. Digital Computer Simulation: Input-Output
Analysis. Santa Monica, Calif.: The RAND Corporation,
RM-5540-PR, 1968.

13. Jenkins, G.M. "General Considerations in the Analysis of
Spectra," Technometrics, III (May, 1961).

14. Mood, A.M., and Graybill, F.A. Introduction to the Theory
of Statistics. 2nd edition. New York: McGraw-Hill Book
Co., 1963.

15. Naylor, Thomas H., and Finger, J.M. "Verification of Computer
Simulation Models," Management Science, XIV (October,
1967), B-92-B-101.

16. Naylor, Thomas H., Wertz, K., and Wonnacott, Thomas H.
"Methods of Analyzing Data from Computer Simulation Experiments,"
Communications of the ACM, X (November, 1967),
703-710.

17. Newell, Allen, and Simon, H.A. "GPS, A Program that Simulates
Human Thought," Computers and Thought. Edited by
E.A. Feigenbaum and J. Feldman. New York: McGraw-Hill
Book Co., 1963.

18. Simon, H.A., and March, J.G. Organizations. New York: John
Wiley & Sons, 1958.

19. Turing, A.M. "On Computable Numbers, with an Application
to the Entscheidungsproblem," Proceedings of the London
Mathematical Society, XLII and XLIII (1936 and 1937),
230-265, 544.

20. Turing, A.M. "Computing Machinery and Intelligence," Mind,
LIX (October, 1950), 433-460. Reprinted in Computers and
Thought. Edited by E.A. Feigenbaum and J. Feldman. New
York: McGraw-Hill Book Co., 1963.


Monte Carlo Techniques: 
Theoretical 


D. C. Handscomb, Oxford University 


Introduction 


The class of variance-reducing techniques that I shall discuss
in this lecture are those, otherwise known as "Monte
Carlo" techniques, which rely not on statistical analysis of 
the input and output variables of a simulation but on reor- 
ganization of the simulation itself. In metaphorical terms 
we can say that, whereas statistical analysis looks at the 
"black box" (the simulation) from the outside only, Monte 
Carlo techniques get inside the "black box" and tamper with 
the machinery. 

This often means that the system under investigation is 
no longer simulated as closely as it might have been in 
every particular, which may at first sight appear to repre- 
sent a pointless waste of information. One should bear in 
mind, however, that the object of a simulation is usually 
presented before we start, and if it is to estimate certain 
specific parameters of a system as precisely as possible, 
then the information that we throw away in order to achieve 
this object is of minor importance. 

(One point needs care, however: we have to discriminate
carefully between those features of the system whose be- 
havior we expect to be upset by the Monte Carlo technique 
and those we expect to be unaffected, since we must ensure 
that only the latter are used for validation purposes.) 

As understood in the Monte Carlo field, the term "variance 
reduction" implies more than the mere reduction of variance. 

Suppose that we have a simulation process involving a
random element at some stage, so that the output variables
have some statistical fluctuation, and that we want to estimate
the expectation of each output variable as precisely as
possible (or as precisely as the general accuracy of the simulation
warrants). Let one of these output variables X have
expectation EX and variance VX. If we perform n independent
replications of the simulation, giving outputs $X_1, X_2, \ldots, X_n$,
then the average

    $\bar{X} = (X_1 + X_2 + \cdots + X_n)/n$

will have expectation $E\bar{X} = EX$ and variance $V\bar{X} = VX/n$. Thus
by using $\bar{X}$ in place of X we have reduced the variance by a
factor of 1/n by doing n times as much work. Since this expedient
is always open to us, we can regard variance and work
as mutually convertible, and so are led to define the efficiency
with which a simulation estimates a parameter by

    efficiency = 1/(variance × work).


A technique is called variance-reducing in the Monte Carlo 
sense if it increases the efficiency; that is if it reduces 
the variance proportionately more than it increases the work 
involved. 
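
The definition can be checked numerically. The sketch below simply encodes the formula above; the variance and work figures used with it are arbitrary, and the function names are ours.

```python
def efficiency(variance, work):
    """Efficiency as defined in the text: 1 / (variance x work)."""
    return 1.0 / (variance * work)

def replicate(variance, work, n):
    """Variance and work of the average of n independent replications."""
    return variance / n, work * n
```

Averaging n replications leaves the efficiency unchanged, since the variance falls by the same factor by which the work rises; a technique that halves the variance at one and a half times the work, on the other hand, raises the efficiency by a factor of 4/3 and so counts as variance-reducing.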

It is easy to see that repeating a simulation n times in- 
volves n times as much work; it is not so easy to gauge the 
relative amounts of work involved in two different simulations, 
as we must do in order to compare their efficiencies. The 
ultimate measure of work is of course computing cost, but there 
are too many variable factors between the mathematical formu- 
lation of a technique and its realization on a computer, fac- 
tors such as the details of the problem and the peculiarities 
of the computer and its programmers, for us to be able to use 
computing cost as a basis for theoretical discussion. What 
we usually do instead is to settle on some fairly fundamental
step of both simulations and to represent the work by the num- 
ber of times that this step is performed. 

Since we have chosen to regard variance and work as interchangeable,
we can call a technique variance-reducing which
does not reduce the variance at all in the usual sense, provided
that it saves work, as the following example illustrates.


An Example 


Suppose that we have a model of the typical firm and its 
policies of expansion, recruitment, and so on, and of the rel- 
evant sections of the economy. One straightforward approach 
would be to simulate the working of such a firm from its setting
up until its bankruptcy, recording the size of its labour
force at the last stage. By repeating this a few times we get
a sample of bankrupt firms from which we obtain an estimate of 
the average number of workers who lose their jobs. 

The obvious drawback to this approach is that under any 
reasonable policy a firm has a high probability of remaining 
in business for a very long time, or even indefinitely. Even 
discounting the latter possibility we must still expect to 
have to do a great deal of work in obtaining our sample. 

Now suppose that we bias the probabilities in the direction 
of financially unfavourable events. If we make the bias 
strong enough we can make eventual bankruptcy certain, and 
the time taken to reach it much shorter, so that (on the rea- 
sonable assumption that the work involved in simulating the 
life of a firm is proportional to the time that it stays in 
business) we can get a sample of the required size with far
less effort.

To compensate for the bias we must give each sample a 
weighting factor. This will increase the work slightly, and 
probably the variance also, but (we hope) not enough to out- 
balance the saving we have achieved. 
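
The biasing and weighting just described can be sketched on a deliberately crude model of the firm. Everything in the model is an assumption made for illustration: the firm's capital moves up or down one unit per period, a good period adds one worker, and bankruptcy occurs when capital reaches zero. Under the original probability of a good period (0.6) many firms never fail; under the biased probability (0.4) every firm fails, and each run carries as its weight the ratio of the original to the biased probability of its history.

```python
import random

def simulate_firm(rng, p_up, p_up_biased, start_capital=5, start_staff=10):
    """Run one firm to bankruptcy using the biased probability.

    Returns (staff at bankruptcy, weight), where the weight is the
    likelihood ratio of the path, original over biased probabilities.
    """
    capital, staff, weight = start_capital, start_staff, 1.0
    while capital > 0:
        if rng.random() < p_up_biased:      # favourable period
            capital += 1
            staff += 1                      # hypothetical: one hire per good period
            weight *= p_up / p_up_biased
        else:                               # unfavourable period
            capital -= 1
            weight *= (1.0 - p_up) / (1.0 - p_up_biased)
    return staff, weight

def weighted_estimates(n, p_up=0.6, p_up_biased=0.4, seed=1):
    """Estimate the bankruptcy probability and the mean staff at bankruptcy."""
    rng = random.Random(seed)
    runs = [simulate_firm(rng, p_up, p_up_biased) for _ in range(n)]
    total_weight = sum(w for _, w in runs)
    p_bankrupt = total_weight / n
    mean_staff = sum(s * w for s, w in runs) / total_weight
    return p_bankrupt, mean_staff
```

With these particular figures the two probabilities are simply interchanged, and every run then happens to carry the same weight, (0.4/0.6) raised to the power of the starting capital. That is a special feature of this toy model; with a less fortunate choice of bias the weights vary from run to run and may inflate the variance, as the text warns.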

This example illustrates a definite advantage that simulation
enjoys over the statistical observation of real life.
In addition to being able to examine dangerous, expensive or
illegal operations with impunity we can, by distorting the
probabilities, examine the outcomes of rare events without
having to sit down and wait for them to occur; we can, so to
speak, produce miracles to order.


Techniques 


All the techniques I shall discuss appear in Hammersley 
and Handscomb [2] (principally in Chapter 5). The emphasis 
of this book is not on simulation, however, and we exhibited 
many of the techniques only by applying them to the evaluation
of the integral

    $\theta = \int_0^1 f(x)\,dx$,

so that readers may have gained the impression that these
techniques work for simple mathematical problems only, and 
not for complicated simulations. 

I shall try to dispel this impression by showing how vari- 
ance-reducing techniques apply to simulation also. I cannot, 
however, predict which techniques will prove effective in 
every given situation; in practice one has to proceed by more 
or less inspired trial and error, learning by experience which 
tools serve one best. My point is just that we chose the 
evaluation of an integral as our example purely to make the 
mathematical theory simple; in a more complicated example it 
would have been much harder to say by how much the techniques 
reduced the variance, but the techniques could still have 
worked. 

These techniques are by no means mutually exclusive, and 
can be combined with one another in many ways in a single 
problem. 


Use of Expected Values 


The simplest form of this technique is, in our terminology 
[2], the replacement of "hit-or-miss" by "crude" Monte Carlo. 
Suppose that we want to estimate the expected number of
occurrences of a certain event in the course of a process. The
"hit-or-miss" method simulates the process and counts the
number of times the event occurs. Now before each possible
occurrence of the event there is a stage at which we know that
the probability of this event occurring is p, say. Instead
of scoring 1 if the event then actually occurs and scoring
0 if it does not, we can score p in either case; the expected
score is unaltered. Then the variance of the score is reduced
by p(1-p) for every p scored.

More generally, if at any stage of a simulation we can 
predict the expectation of the score to be added at some 
later stage, then we get a reduction in variance by scoring 
this expectation in place of the later actual score. 

Notice that we can apply this technique whether the scores 
are independent or not. 
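
A small numerical sketch may make the comparison concrete. It assumes a made-up process (not from the text) with a fixed number of stages, at each of which the probability p of the event is itself drawn at random and is known before the event is resolved. The function `one_run` returns both the hit-or-miss score and the expected-value score for the same run.

```python
import random
from statistics import mean, pvariance

def one_run(rng, n_stages):
    """Return (hit-or-miss score, expected-value score) for one simulated run."""
    hits, expected = 0, 0.0
    for _ in range(n_stages):
        p = rng.random()      # hypothetical: probability known at this stage
        expected += p         # score p whether or not the event occurs
        if rng.random() < p:
            hits += 1         # score 1 only if the event actually occurs
    return hits, expected

def compare(n_runs=4000, n_stages=10, seed=0):
    """Means and variances of the two scores over n_runs replications."""
    rng = random.Random(seed)
    runs = [one_run(rng, n_stages) for _ in range(n_runs)]
    hit_scores = [h for h, _ in runs]
    exp_scores = [e for _, e in runs]
    return (mean(hit_scores), pvariance(hit_scores),
            mean(exp_scores), pvariance(exp_scores))

if __name__ == "__main__":
    print(compare())
```

Both scores have the same expectation; only the variance due to resolving each event is removed, so the variance due to the random values of p remains.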

Consider the following example. Suppose that we know the 
size and character of the population at time 0, and the prob- 
ability laws governing birth, marriage, and death, and want 
to know the size at time T. We simulate the progress of the 
population directly. The straightforward way of estimating 
the size at T is to count the simulated individuals alive at 
that time. 

We can improve on this in the first instance by looking at 
all individuals either in the original population or born in 
the interval between 0 and T, and scoring for each one the 
probability of its surviving until time T (from time 0 or 
from its time of birth, respectively), whether it in fact sur- 
vives or not. This cuts out the variance due to random times 
of death, and is valid in spite of the obvious dependence be- 
tween certain individuals (namely parents and their own off- 
spring). 

We could go farther back than this by considering marriages; 
if for each marriage we can score the expected number of off- 
spring alive at time T (adding to this again the probabilities 
that each individual of the original population survives to 
time T), then we cut out also the variance due to random times
of birth and sizes of family (except of course the indirect
effect of the births in the previous generation).


Stratified Sampling 

The object of the previous technique was to cut out the 
variance due to fluctuations close to the time of recording 
an observation; the somewhat complementary object of stratif- 
ication is to cut out the variance due to systematic fluctua- 
tions. Every sample obtained by simulation is assigned ac- 
cording to some rule to one of two or more previously-defined 
classes or strata, chosen so that the variation of score be- 
tween strata is substantially greater than the variation with- 
in each stratum. Instead of allowing the division of samples 
between the various strata to decide itself, we arrange to 
generate a fixed number of samples in each stratum. (If 
these numbers are not in the same proportion as the expected 
numbers, then we must weight them accordingly). 

In practice this often means discarding every sample found 
to fall in a stratum, once the requisite number for that stra- 
tum have been generated; for maximum efficiency, therefore, 
it is well that the criterion on which the stratification is 
based should arise as early as possible in the simulation 
process, so that unwanted samples can be discarded before 
much work has been wasted on them. 

To derive the maximum benefit from stratification, the
number of samples in the jth stratum should be proportional
to $p_j\sqrt{v_j}$, where $p_j$ is the a priori probability that a sample
falls in that stratum and $v_j$ is the variance of the score over
that stratum; these samples should be weighted by a factor of
$1/\sqrt{v_j}$. If we do not know the size of $v_j$, however, we can do
nearly as well by making the number proportional to $p_j$, or in
other words making the number conform precisely with
expectation.
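
The two allocation rules can be sketched as below. The helper names and the `draw(j, rng)` callback, which returns one score from stratum j, are hypothetical. The estimate is the sum over strata of the stratum probability times the stratum sample mean, so each sample in stratum j carries weight p_j/n_j; under the first allocation rule this weight is proportional to the reciprocal of the stratum standard deviation.

```python
import math
import random

def allocate(p, v, n, optimal=True):
    """Samples per stratum: proportional to p_j*sqrt(v_j), or to p_j alone."""
    shares = [pj * math.sqrt(vj) if optimal else pj for pj, vj in zip(p, v)]
    total = sum(shares)
    return [max(1, round(n * s / total)) for s in shares]

def stratified_estimate(p, draw, counts, seed=0):
    """Sum over strata of p_j times the mean of counts[j] scores from stratum j."""
    rng = random.Random(seed)
    estimate = 0.0
    for j, n_j in enumerate(counts):
        scores = [draw(j, rng) for _ in range(n_j)]
        estimate += p[j] * sum(scores) / n_j
    return estimate

if __name__ == "__main__":
    # Hypothetical two-stratum problem with normally distributed scores.
    p, means, sds = [0.5, 0.5], [0.0, 10.0], [1.0, 3.0]
    counts = allocate(p, [s * s for s in sds], 400)
    print(counts)
    print(stratified_estimate(p, lambda j, rng: rng.gauss(means[j], sds[j]), counts))
```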

Consider the following example. A manufacturing process 
taking of the order of 10 days may be affected by the incidence
of week-ends (when certain staff or services may not be
available). In such a case it may be worth while to stratify
on the basis of the day of the week on which some important
early stage of the process is reached.


Importance Sampling 


We carried out a form of importance sampling in the course 
of stratified sampling when we introduced the factor of $\sqrt{v_j}$,
the idea then being to concentrate more of the sampling in 
the more "important" strata (in that instance the strata 
where the variances were highest), compensating for the bias 
by weights. 

The treatment of importance sampling in [2] is unsatisfac- 
tory for the kind of processes we are now considering. The 
argument in [2] ran as follows: if g(x) is any probability 
density function over [0,1] then we get an unbiased estimator 
of 


     $\theta = \int_0^1 f(x)\,dx$


by sampling X from the distribution with density g(x), and 
scoring f(X)/g(X); if g(x) is proportional to f(x) we then 
get a zero-variance estimator, so that we try to take g(x) as 
nearly as possible proportional to f(x). 

That is all very well if f(x) is a function given to us
explicitly. If this represents a stage of a simulation process,
however, f(x) can usually not be calculated but is itself
statistically estimated; what we really score is Y/g(X),
where Y is a second random variable with expectation f(X) and
variance v(X) (say). We can obviously no longer get a
zero-variance estimator; in order to get a minimum-variance
estimator we need to take g(x) proportional to
$\sqrt{f(x)^2 + v(x)}$, when the variance will be

     $\left[\int_0^1 \sqrt{f(x)^2 + v(x)}\,dx\right]^2 - \left[\int_0^1 f(x)\,dx\right]^2.$


In simple terms, the best distribution g(x) is proportional 
not to the mean but to the root-mean-square value of Y. 
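
The root-mean-square rule can be checked numerically. The sketch below uses the fact that, when X is drawn with density g, the score Y/g(X) has variance equal to the integral of (f^2 + v)/g minus the square of the integral of f. The particular f and v are invented for illustration, and the integrals are approximated by a midpoint rule.

```python
import math

N_GRID = 20000  # midpoint-rule resolution on [0, 1]

def integrate(fn):
    h = 1.0 / N_GRID
    return sum(fn((k + 0.5) * h) for k in range(N_GRID)) * h

def normalised(shape):
    """Turn a positive function on [0, 1] into a probability density."""
    total = integrate(shape)
    return lambda x: shape(x) / total

def score_variance(f, v, g):
    """Var[Y/g(X)] = integral of (f^2 + v)/g, minus theta^2, X drawn from g."""
    theta = integrate(f)
    return integrate(lambda x: (f(x) ** 2 + v(x)) / g(x)) - theta ** 2

# Hypothetical case: mean f(x) and noise variance v(x) of the score Y.
f = lambda x: x + 0.05
v = lambda x: 0.25
rms = lambda x: math.sqrt(f(x) ** 2 + v(x))

var_uniform = score_variance(f, v, lambda x: 1.0)
var_prop_f = score_variance(f, v, normalised(f))       # g proportional to mean
var_prop_rms = score_variance(f, v, normalised(rms))   # g proportional to r.m.s.

if __name__ == "__main__":
    print(var_uniform, var_prop_f, var_prop_rms)
```

In this made-up case, taking g proportional to f is worse than uniform sampling, because small values of f with non-negligible noise are divided by a small density.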


Since there is no prospect of a zero-variance estimator,
there is no point in taking much trouble to find the best
distribution g(x); it is more important from the point of 
view of efficiency to ensure that the distribution is easily 
sampled and that g(X) is easily calculated, letting it follow 
the r.m.s. value of Y in general shape only. 

Provided that f(x) and v(x) depend on x in not too erratic 
a manner, a plan of campaign could be first to estimate the 
r.m.s. values of Y for selected values of x by a preliminary 
(possibly simplified) simulation, then to fit a low-order 
non-negative polynomial p(x) to these values; then we proceed
with the simulation proper, using importance sampling with

     $g(x) = p(x) \Big/ \int_0^1 p(x)\,dx.$

Rosenberg [5] has described a particularly effective and
easily-applied method following this plan.


Control Variates 


The control-variate technique applies when there is an 
approximation to the process we are simulating that can be 
treated theoretically; we can then follow through the actual 
and control processes simultaneously, using the same random 
numbers at corresponding points in the two processes, and 
observe the difference between the two final scores. This 
difference is then an estimate of the correction to be applied 
to the theoretical results of the control process. This tech- 
nique cuts out the variance inherent in the control process, 
leaving only the variance of the error of approximation, 
which should be of a lower order of magnitude. 

The same principle can be applied when we want to compare 
the outcomes of two similar processes, differing in minor 
respects only. Instead of simulating them independently we 
simulate them simultaneously, with the same random numbers 
again. This imposes a high correlation between the outcomes 
of the two processes. 
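
The effect of sharing random numbers between two similar processes can be illustrated with a small queue. Everything specific here is an assumption made for the sketch: a single-server queue with exponential arrival gaps and service times, two service policies with mean service times 0.8 and 0.7, and the average wait as the score.

```python
import math
import random
from statistics import pvariance

def mean_wait(uniforms, mean_service, mean_gap=1.0):
    """Average wait in a single-server queue driven by pairs of uniforms."""
    wait = total = 0.0
    for u_gap, u_service in uniforms:
        total += wait
        service = -mean_service * math.log(1.0 - u_service)
        gap = -mean_gap * math.log(1.0 - u_gap)
        wait = max(0.0, wait + service - gap)   # wait of the next customer
    return total / len(uniforms)

def draw(rng, n):
    return [(rng.random(), rng.random()) for _ in range(n)]

def difference_variances(n_reps=300, n_customers=200, seed=0):
    """Variance of the estimated difference: shared versus independent numbers."""
    rng = random.Random(seed)
    common, independent = [], []
    for _ in range(n_reps):
        u = draw(rng, n_customers)
        common.append(mean_wait(u, 0.8) - mean_wait(u, 0.7))
        independent.append(mean_wait(draw(rng, n_customers), 0.8)
                           - mean_wait(draw(rng, n_customers), 0.7))
    return pvariance(common), pvariance(independent)

if __name__ == "__main__":
    print(difference_variances())
```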

Consider the following example for estimating the effect of
a change in tax policy on profits. Instead of simulating the
running of the organization twice independently, once using
the old tax structure and once using the new, we may be able 
to simulate it once only, scoring a gain or loss at every 
point where the tax deducted changes. 

Similarly we may at the same time be able to estimate the 
effects of changes of policy which have been suggested to 
accommodate the changed tax structure, still basing everything
on a single simulation.

We can even try to estimate the effect of an unknown change.
Suppose we denote the change in a certain tax rate by $\delta$; then
at every point where this tax is deducted we score a gain or
loss depending on $\delta$, finally arriving at a total gain or loss
which is still expressed as a function of the unknown $\delta$.


Antithetic Variates 


Re-use of the same random numbers in the control-variate 
technique can be thought of as a way of introducing a positive 
correlation between quantities to be subtracted. The anti- 
thetic-variate technique conversely introduces a negative 
correlation between quantities to be added. Its effective- 
ness depends rather heavily on the care with which we decide 
when to use it, since every application multiplies the min- 
imum number of samples we have to take by a factor of at 
least 2.

In the simplest form of the technique, at some point where 
we have to select a uniformly-distributed random number X 
from the range [0,1], we select instead a pair of numbers X 
and 1-X (each of which on its own is of course uniformly dis- 
tributed) and continue the process twice, based on each num- 
ber of the pair in turn. We then average the outcomes of the 
two processes. If there is negative correlation between the 
score arising from X and the score arising from 1-X, then the 
average of the two scores will have less than half of the
variance of either individual score, and we shall have gained
some efficiency.




(We have a choice of ways in which to continue the process 
from this point; we may let the processes based on X and 1-X 
continue independently, or we may continue to use complementary
random numbers in the two processes. We may also, of
course, let one or both processes divide again in the same
way.)

Another form of the technique selects a system of n > 2
numbers X, X + 1/n, X + 2/n, ..., X + (n-1)/n, subtracting 1
where necessary to bring them into the range [0,1], and
similarly continues the process n times. This may be thought of
as a form of stratification (since it ensures that one number
falls in every interval of length 1/n) with the additional
feature of correlation between the strata.
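
Both forms of the technique can be sketched in a few lines. The score function used for the comparison (the exponential of the random number) is an arbitrary monotone choice made for illustration; `antithetic_mean` and `rotated_system` are hypothetical helper names.

```python
import math
import random
from statistics import pvariance

def antithetic_mean(score, x):
    """Average of the scores arising from x and from its complement 1 - x."""
    return 0.5 * (score(x) + score(1.0 - x))

def rotated_system(x, n):
    """The n numbers x, x + 1/n, ..., x + (n-1)/n, reduced into [0, 1)."""
    return [(x + j / n) % 1.0 for j in range(n)]

def compare(score=math.exp, n=5000, seed=0):
    """Variance of a single score and of an antithetic pair average."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n)]
    single = pvariance([score(x) for x in xs])
    paired = pvariance([antithetic_mean(score, x) for x in xs])
    return single, paired

if __name__ == "__main__":
    print(compare())
    print(rotated_system(0.37, 5))
```

A pair costs two evaluations, so the pair average must have less than half the variance of a single score before anything is gained; for a strongly monotone score such as this one the reduction is far larger.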


Quasi-random Numbers 


The last technique that I shall mention is of a different 
character from the rest. Hitherto we have assumed that the 
"random numbers" used in our simulations have been "random" 
in every respect (which in practice means passing all the 
tests usually applied to a random number generator), and in 
particular that replications of the simulation have been 
statistically completely independent. Then, as we observed 
earlier, the average score X of n replications has a variance 
decreasing like 1/n. 

If we know, however, that we are in any case likely to need 
a large number of replications to bring the variance down to 
an acceptable level, we may be able to get a more rapid de- 
crease in variance by the use of so-called "quasi-random num- 
bers", which are less independent than random numbers but more 
evenly distributed. 

More strictly we should say "quasi-random vectors", since
the property that characterizes them is that if the m random
numbers used in each replication of the simulation (supposing
m fixed) are regarded as the components of a vector in
m-dimensional space, then the n such vectors arising from the
first n replications are distributed as evenly as possible over
the m-dimensional unit cube.

By various quasi-random schemes [1,3,4] it is possible to 
get the variance of X to decrease like $1/n^2$, or even faster,
for large n.
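
As an illustration of one such scheme, the sketch below builds two-dimensional quasi-random vectors by the radical-inverse construction in bases 2 and 3 (the construction studied in [1]). The choice of this scheme, the test score x*y, and the helper names are assumptions made for the example; the text does not single out a particular scheme.

```python
import random

def radical_inverse(i, base):
    """Reflect the base-`base` digits of the integer i about the radix point."""
    inv, frac = 0.0, 1.0 / base
    while i > 0:
        inv += (i % base) * frac
        i //= base
        frac /= base
    return inv

def halton(i, bases=(2, 3)):
    """The i-th quasi-random vector, one coordinate per base."""
    return tuple(radical_inverse(i, b) for b in bases)

def average(points, score):
    return sum(score(p) for p in points) / len(points)

def compare(n=4096, seed=0):
    """Average score over n quasi-random and n pseudo-random vectors."""
    score = lambda p: p[0] * p[1]   # hypothetical score; exact mean is 1/4
    quasi = average([halton(i) for i in range(1, n + 1)], score)
    rng = random.Random(seed)
    pseudo = average([(rng.random(), rng.random()) for _ in range(n)], score)
    return quasi, pseudo

if __name__ == "__main__":
    print(compare())
```

In one dimension the first 2^k points in base 2 fall exactly one in each interval of length 2^-k, which is the evenness property described above.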


Conclusion 


I have described a few of the variance-reducing techniques 
used in Monte Carlo work and shown how some of them apply in 
the simulation of real processes. When applied to rather 
artificial problems of mathematics and physics, techniques 
like these have produced very impressive results; one does not 
expect to be so impressed when one applies them to problems 
filled with awkward details and special cases, of the type 
arising in the simulation of real life, yet it is still possible
to gain enough efficiency to make them worth considering.


Bibliography 


1. Halton, J.H. "On the Efficiency of Certain Quasi-Random
Sequences of Points in Evaluating Multidimensional
Integrals," Numerische Mathematik, II (1960), 84-90, 196.

2. Hammersley, J.M., and Handscomb, D.C. Monte Carlo Methods.
New York: John Wiley & Sons, 1964.

3. Haselgrove, C.B. "A Method for Numerical Integration,"
Mathematics of Computation, XV (1961), 323-337.

4. Richtmyer, R.D. "A Non-Random Sampling Method, Based on
Congruences, for Monte Carlo Problems," Institute of
Mathematical Sciences, New York University, Report NYO-8674
Physics (1958).

5. Rosenberg, L. "Bernstein Polynomials and Monte Carlo
Integration," SIAM Journal on Numerical Analysis, IV (1967),
566-574.


Monte Carlo Techniques: 
Practical 


William A. Moy, University of Wisconsin 


Introduction 


Numerous techniques have been proposed for increasing the 
sampling efficiency of Monte Carlo simulations above that 
obtainable with simple random sampling. Some have been applied 
in particle-physics applications with outstanding success [16]. 
Some have been applied to simple examples of operational pro- 
blems [3,6,23]. But no applications of methods applicable to 
broad classes of operational problems have been reported. 

The purpose of this paper is to develop and compare four 
such variance-reducing techniques in the simulation of certain 
types of queuing systems. 

The systems to which we restrict our attention are charac- 
terized by regularly returning, after either a fixed time per- 
iod or fixed number of customers, to a known (generally empty) 
initial state. Many systems of practical importance are of 
this type. The classical toolcrib problem is an example, since 
the queue always is empty at the end of a work shift. For con- 
venience, we shall refer to any such system as a “periodic 
queuing system." 

The four techniques considered are (1) regression sampling, 
(2) antithetic-variate sampling, (3) stratified sampling, and 
(4) importance sampling. In each case, two alternative pro- 
cedures for applying the technique are considered. One is com- 
pletely independent of the system being simulated. The other 
is in some way dependent upon the particular system being
simulated. Both, however, are general purpose in that they may
be applied in the simulation of any periodic queuing system.





Problem Formulation 


Define the sample space $S_R$ by

(1)  $S_R = \{\, R = (R_1, \ldots, R_m) : R_i \text{ is a random variable uniformly distributed in the interval } (0,1) \text{ and } m \text{ is fixed and finite} \,\}$

That is, $S_R$ is the space of all random number vectors of
length m.

Associated with every point r in $S_R$ is a probability density
f(r). If the elements of the random number vector are
independent and identically distributed with

(2)  $f(r_i) = \begin{cases} 1, & 0 \le r_i \le 1 \\ 0, & \text{otherwise} \end{cases}$

then the joint probability density function is given by

(3)  $f(r) = \begin{cases} 1, & 0 \le r_i \le 1 \text{ for } i = 1, \ldots, m \\ 0, & \text{otherwise} \end{cases}$

Consider the random variable Y defined on $S_R$ implicitly by
a specified computer program. By this we mean that given a
random number vector R = r the computer program maps this
random vector into a value of the random variable Y. For
generality let this mapping be dependent upon a vector of input
parameters $\gamma$, i.e., $Y = Y(R; \gamma)$. In this paper, the
computer program will be, of course, a model of a periodic
queuing system.

Y could be either a vector-valued or scalar-valued function, 
but we shall restrict our attention to scalar-valued functions 
only. For example, Y might be the total customer waiting time 
during one period of the queuing system, or the total cost of 
operating the system during one period. 

We assume that the purpose of conducting the Monte Carlo
simulation is to estimate the expected value of Y, and to
obtain as accurate an estimate as possible with a given
investment in computing cost. Formally, this expectation is
given by

(4)  $E(Y) = \int_{S_R} Y(r)\, f(r)\, dr$


The usual procedure for estimating E(Y) would be to gener- 
ate a sample of n random number vectors from the joint pdf f(r), 
to compute the value of Y for each, and to take the mean of 
these n observations of Y as the estimate. 

It must be realized that it is impossible to generate ran- 
dom number vectors which are truly random samples from f(r) 
since each number is limited to a finite number of digits. 
However, no practical simulation will be so pathological that 
this would make any significant difference. Since the integral 
form of E(Y) is much easier to work with than the corresponding 
form involving a summation, we shall just assume that a random 
number vector generated by the computer is a random sample 
from f(r).

Even though Y is seldom, if ever, known explicitly we do 
have a freedom in specifying its form which if recognized and 
exploited may enable certain increases in the efficiency of 
the Monte Carlo technique. It should be noted that the avail- 
ability of this extra freedom is one of the main differences 
between experimenting on physical systems and experimenting on 
simulation models of such systems. 

In this paper this extra freedom is used in a very elemen- 
tary way. We simply require that in all simulation models (i.e., 
computer programs) the correlation between each random number 
used in the simulation and the congestion caused by the simu- 
lated system event generated by that random number be positive 
and as large as possible. 

Thus, if random numbers are used to generate service times 
in the running of the program, we require that large random 
numbers generate long service times. Similarly, in generating 
customer interarrival times, we require that large random 
numbers generate short interarrival times. 
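
One way to meet this requirement is to choose the direction of each inverse transformation deliberately. The sketch below assumes exponential service and interarrival times and a single server, none of which is specified in the text; the function names and default means are likewise hypothetical.

```python
import math

def service_time(r, mean):
    """Exponential service time; a large random number gives a long service."""
    return -mean * math.log(1.0 - r)

def interarrival_time(r, mean):
    """Exponential gap to the next arrival; a large random number gives a short gap."""
    return -mean * math.log(r)

def total_wait(r_gaps, r_services, mean_gap=1.0, mean_service=0.8):
    """Total customer waiting time over one period that starts empty.

    r_gaps[k] generates the gap between customer k and customer k + 1;
    r_services[k] generates the service time of customer k.
    """
    wait = total = 0.0
    for r_gap, r_service in zip(r_gaps, r_services):
        total += wait
        wait = max(0.0, wait + service_time(r_service, mean_service)
                   - interarrival_time(r_gap, mean_gap))
    return total

if __name__ == "__main__":
    print(total_wait([0.1] * 5, [0.1] * 5), total_wait([0.9] * 5, [0.9] * 5))
```

With both transformations oriented this way, raising any single random number can only lengthen a service or shorten a gap, so the total wait never decreases.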

Such a procedure would ensure, for example, that Y(0), i.e.,
the value of Y when R = (0,...,0), is the minimum value of
Y, and Y(1) is the maximum value.

The four alternatives to random sampling are presented in
the next four sections. Experimental results follow.


Regression Sampling 


Consider the random variable Z defined on $S_R$ by

(5)  Z(R;b) = Y(R) - X(R;b) + E[X(R;b)]

where X(R;b) -- called the auxiliary variable -- is a random
variable defined on $S_R$ which depends on K parameters given by
the vector b = $(b_1, \ldots, b_K)$. We require that the expectation
of X be known numerically for any value of the parameter
vector.

Clearly E(Z) = E(Y), and the variance of Z is given by 


(6)  V(Z) = V(Y) + V(X) - 2 Cov(X,Y)


Suppose that for a fixed total sample cost either n random
values of Y or $n_1$ ($n_1 < n$) random values of Z could be obtained.
Sampling for Z will be preferred over simple random sampling
if $V(Z)/n_1 < V(Y)/n$. A necessary condition for this inequality
to be satisfied is that Cov(X,Y) be large and positive,
or, in other words, that the correlation between X and Y be
near +1.

The method proposed here is a generalization of a well-known 
sample-survey technique. The possibility of using this more 
restricted technique in Monte Carlo simulations has been
recognized by several authors [16,17,18,20,26]. However, none have
indicated how the technique can actually be applied to specific 
Monte Carlo problems, and none have reported any results with 
real simulation models. 

The problem of specifying the auxiliary variable may be con- 
sidered in two phases: that of specifying a good form for 
X(R;b) and that of specifying the parameter values for a speci- 
fic application. 

Consider the second problem first. Given X(R;b) in parametric
form, it is clear that b should be specified so that the
variance of the estimate of E(Y) is minimized. It can be shown
[22] that under certain reasonable assumptions this criterion
is equivalent to choosing the b's so as to minimize the sum of
the squared differences between the $X_i$ and the $Y_i$ values. Thus,
given a sample of n observations, the optimum parameter values
are given by the K equations

(7)  $\sum_{i=1}^{n} X_i \frac{\partial X_i}{\partial b_k} = \sum_{i=1}^{n} Y_i \frac{\partial X_i}{\partial b_k}, \quad k = 1, \ldots, K$

If b and $\bar{z} = (1/n) \sum Z_i = (1/n) \sum [Y_i - X_i + E(X)]$ are calculated
from the same sample, then $\bar{z}$ will in general be a biased
estimate of E(Y) because of covariance terms. The amount of the
bias usually will be unknown, although, in special cases at
least, it is known that $\bar{z}$ is a consistent estimator of E(Y).

One approach to the elimination of the bias [17] is to split
the sample into two or more groups, to estimate b separately
from each group, and to use the average of the estimates from
all other groups in calculating the auxiliary variables for the
observations within any group. The $X_i$ and b within any group
are thus independent and $\bar{z}$ gives an unbiased estimate of E(Y).

The variance of $\bar{z}$ determined in this manner will be somewhat
greater than V(Z)/n, but may still be less than the
corresponding V(Y)/n. We shall adopt this approach.

Limited experimental results [22] indicate that for samples 
of the size likely to be found in simulation studies, no improve- 
ment is obtained by using more than two groups. The experi- 
mental results reported in the section below used this approach 
with two groups of equal size. 
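
The two-group procedure can be sketched as follows. The queue model inside `total_wait`, the exponential distributions, and the choice of a linear auxiliary variable in the sum of the random numbers are assumptions made for illustration, not the experimental set-up of this paper. Parameters fitted on one group are applied only to the observations of the other group.

```python
import math
import random
from statistics import mean, pvariance

def total_wait(r):
    """Y(r): first half of r drives arrival gaps, second half service times."""
    n = len(r) // 2
    wait = total = 0.0
    for k in range(n):
        total += wait
        service = -0.8 * math.log(1.0 - r[n + k])   # large r -> long service
        gap = -1.0 * math.log(r[k])                 # large r -> short gap
        wait = max(0.0, wait + service - gap)
    return total

def fit_line(s, y):
    """Least-squares b0, b1 for the auxiliary variable X = b0 + b1 * s."""
    s_bar, y_bar = mean(s), mean(y)
    b1 = (sum((a - s_bar) * (b - y_bar) for a, b in zip(s, y))
          / sum((a - s_bar) ** 2 for a in s))
    return y_bar - b1 * s_bar, b1

def regression_sample(n=2000, m=40, seed=0):
    """Return mean and variance of Y, then mean and variance of Z."""
    rng = random.Random(seed)
    eps = 1e-12                                     # keep r strictly inside (0, 1)
    rs = [[min(max(rng.random(), eps), 1 - eps) for _ in range(m)]
          for _ in range(n)]
    y = [total_wait(r) for r in rs]
    s = [sum(r) for r in rs]
    half = n // 2
    z = []
    for own, other in ((slice(0, half), slice(half, n)),
                       (slice(half, n), slice(0, half))):
        b0, b1 = fit_line(s[other], y[other])       # b from the other group
        expected_x = b0 + b1 * m / 2.0              # E[sum of R_j] = m/2
        z += [yi - (b0 + b1 * si) + expected_x
              for si, yi in zip(s[own], y[own])]
    return mean(y), pvariance(y), mean(z), pvariance(z)

if __name__ == "__main__":
    print(regression_sample())
```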

To be a useful auxiliary variable, any candidate must pass 
three tests. First, the solution of the set of equations given 
by (7) for calculating the parameters from a sample must be 
computationally feasible. Secondly, given a set of parameter 
values, the calculation of the expected value of the variable 
must also be computationally feasible. Thirdly, when applied
to the simulation of a periodic queuing system, it must yield
a variance reduction that is greater than could be obtained by
simply devoting the same amount of extra effort to simple ran- 
dom sampling. 

The imposed relationship between the random numbers used in 
generating the system events and the congestion caused by the 
events suggests that a system-independent auxiliary variable 
might be defined in terms of the sum or average of the random 
numbers used in generating the observation. Examples of such
auxiliary variables are

(8)   $X(R;b) = b_0 + b_1 \sum R_j$

(9)   $X(R;b) = b_0 + b_1 \sum R_j^2$

(10)  $X(R;b) = b_0 + b_1 \sum R_j + b_2 (\sum R_j)^2$

where the summations are for j = 1,...,m. Each of these
satisfies the first two requirements in the paragraph above.
System-dependent auxiliary variables may be defined in an 
endless number of ways. Two general types will be considered. 
The first includes auxiliary variables that are defined 
directly on the random numbers after they have been classi- 
fied according to their use in the simulation. The second type 
includes those that are defined on the events generated by the 
random numbers rather than directly on the random numbers. 

Suppose that the numbers in the random number vector R are
regrouped such that the first $m_1$ of them are used to generate
system events of type 1, the next $m_2$ are used to generate
events of type 2, and so forth. Within each subgroup, let the
numbers be listed in the order in which they are used. Let
this vector of random numbers be designated R*, and let the
vector $R^*_1$ be the vector of the first $m_1$ numbers in R*, the
vector $R^*_2$ the next $m_2$ numbers in R*, and so on. Then both types
of system-dependent auxiliary variables are special cases of
an auxiliary variable of the form

(11)  $X = f_1(R^*_1) + f_2(R^*_2) + \cdots + f_L(R^*_L)$

where $E[f_i(R^*_i)]$ is known for all i.


For the first type of auxiliary variable to be considered,
$f_i(R^*_i)$ may be any function containing the random numbers and
unknown parameters to be estimated from the sample
observations.

For the second type, $f_i$ will be a composite of two functions,
say $g_i$ and $h_i$, where $h_i$ is the function that generates a
system event of type i from a random number, and $g_i$ is a function
of such a system event whose parameters are estimated from the
sample, i.e., $f_i(R^*_i) = g_i[h_i(R^*_i)]$.

In general, each term of (11) will involve at least one un- 
known parameter which must be estimated from sample data. Since 
the solution of (7) becomes unwieldy if the number of parameters 
is large, say more than 10, L in (11) must be restricted. 

If the number of event types generated in the simulation is 
small, this method may be applied directly. Otherwise, the ef- 
fective value of L must be reduced through the elimination of 
terms which will likely have small effect on the correlation 
between X and Y or through the aggregation of terms likely to 
have similar parameter values. Our experience indicates that 
if the analyst applying the technique has adequate knowledge of 
the system being simulated, he can successfully perform this
elimination or aggregation on an intuitive basis.


Antithetic-Variate Sampling


Consider next a random variable Z defined by

(12)  $Z(R) = \alpha Y(R) + (1-\alpha) X(R)$

where X(R) is any random variable whose expectation is known to
be equal to E(Y) and $0 < \alpha < 1$. The mean of Z is E(Y) and the
variance is given by

(13)  $V(Z) = \alpha^2 V(Y) + (1-\alpha)^2 V(X) + 2\alpha(1-\alpha)\,\mathrm{Cov}(X,Y)$

If the correlation between X and Y is sufficiently close to
-1, it is clear that V(Z) may be less than V(Y) for a given
sampling expenditure.


This method of sampling is known as the method of antithetic
variates. It is a special case of general correlated sampling
methods. It was first expounded in a paper by Hammersley and
Morton [11] in 1956. It has since been discussed and expanded
upon in several other works by Hammersley and his collaborators
[8,9,10,12,21], by Tukey [26], and by Page [23]. Marshall [20]
also used the idea, but without referring to the technique by
name.

The problem in applying this method in any simulation is to 
be able to find a suitable auxiliary variable X. In certain 
special cases rather ingenious ways have been devised for speci- 
fying X. Marshall [20] gives one such example. But such 
methods lack generality. 

A very simple system-independent method for obtaining X is 
to let X(R) = Y(R*) where the vector R* = (1-R). That is, the
auxiliary variable is computed by rerunning the simulation using 
the complements of the random numbers. 
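
A rerun with complemented random numbers might look like the sketch below. The queue model in `total_wait` is a stand-in chosen for illustration (single server, exponential gaps and services, total waiting time as Y), and the two runs are combined with equal weights.

```python
import math
import random
from statistics import mean, pvariance

def total_wait(r):
    """Y(r): first half of r drives arrival gaps, second half service times."""
    n = len(r) // 2
    wait = total = 0.0
    for k in range(n):
        total += wait
        service = -0.8 * math.log(1.0 - r[n + k])   # large r -> long service
        gap = -1.0 * math.log(r[k])                 # large r -> short gap
        wait = max(0.0, wait + service - gap)
    return total

def antithetic_runs(n=2000, m=40, seed=0):
    """Return Cov(X, Y), V(Y), and the variance of Z = (Y + X) / 2."""
    rng = random.Random(seed)
    eps = 1e-12                                     # keep r strictly inside (0, 1)
    y, x = [], []
    for _ in range(n):
        r = [min(max(rng.random(), eps), 1 - eps) for _ in range(m)]
        y.append(total_wait(r))
        x.append(total_wait([1.0 - u for u in r]))  # rerun with complements
    y_bar, x_bar = mean(y), mean(x)
    cov = mean((a - y_bar) * (b - x_bar) for a, b in zip(y, x))
    z = [0.5 * (a + b) for a, b in zip(y, x)]       # equal weights on Y and X
    return cov, pvariance(y), pvariance(z)

if __name__ == "__main__":
    print(antithetic_runs())
```

Because each value of Z costs two runs, the comparison that matters is the variance of Z against half the variance of Y.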

If the simulation model is such that the ith random number 
in each sequence will surely generate the same system event, 
then it is likely--and in some cases certain--that Cov(X,Y) 
will be negative. This is true regardless of whether the cor- 
relation between the random numbers and the congestion gener- 
ated by them is forced to be positive for all types of system 
events. 

If the simulation model is such that the ith random number 
in each sequence may not generate the same system event, then 
imposing the relationship between the random numbers and the 
events they generate may still cause Cov(X,Y) to be negative. 

A potential advantage of regression sampling and antithetic- 
variate sampling over the two methods which follow is that they 
are essentially applied after the fact. That is, a Monte Carlo 
simulation is performed using random sampling methods, and only 
then are the values of the auxiliary variable X calculated and 
the Z's obtained. These two stages are functionally completely
independent, except that in performing the random sampling,
provisions must be made for obtaining any data which will be
needed later in calculating the X's. 

Thus the simulation aspect of simple random sampling is pre- 
served. In those cases where one wants to "observe" the sim- 
ulated operation of the system in addition to estimating the 
expected value of some system property as accurately as pos- 
sible, this may be an important consideration. 

A second advantage is that they can be applied to any exist- 
ing simulation program with only very minor changes. 


Stratified Sampling 


Let $S_R$ be partitioned into L mutually exclusive and exhaustive
subspaces $S_1, \ldots, S_L$ called strata. From each stratum
a random sample of size $n_h$ is obtained, and the resulting values
of the variable Y are averaged as $\bar{y}_h$. From these L values of
$\bar{y}_h$ a pooled estimate of E(Y) is obtained from

(14)  $\bar{y}_{st} = \sum_{h=1}^{L} W_h \bar{y}_h$

where $W_h$, the stratum weight, is the probability that a
randomly selected R will be in stratum h.

It can be shown that $\bar{y}_{st}$ is an unbiased estimate of the
population mean $\mu$ of the random variable Y, and thus that this
procedure may be used as a substitute for simple random
sampling. Such a sampling scheme is known as a stratified sampling
procedure.

Stratified sampling is a commonly used technique in sample 
survey work. The general theory is presented in many places. 
Among the most complete presentations are the books by Cochran
[4], Dalenius [5], and Hansen, Hurwitz and Madow [13,14].
The paper by Evans [7] is also very useful. 

The use of stratified sampling in Monte Carlo simulations 
has been discussed by Albert [1], Clark [2], Ehrenfeld and 
Ben-Tuvia [6], Kahn [17], and Tocher [25]. However, these
discussions have all lacked generality, and no methods of
stratification have yet been proposed which may be applied to
practical digital computer simulations of queuing systems.

Assuming an infinite population, the variance of $\bar{y}_{st}$ is
given by

(15)  $V(\bar{y}_{st}) = \sum_{h=1}^{L} W_h^2 \sigma_h^2 / n_h$

where $\sigma_h^2$ is the variance of Y over stratum h.

Consider an allocation of the total sample n among the strata
such that $n_h$ is proportional to the stratum weight, i.e.,
$n_h = nW_h$. Denoting the estimate of E(Y) obtained by this method
of stratification as $\bar{y}_{st,prop}$, it can be shown that

(16)  $V(\bar{y}_{st,prop}) = \frac{1}{n} \sum_{h=1}^{L} W_h \sigma_h^2 = \frac{V(Y)}{n} - \frac{\sigma_b^2}{n}$

where $\sigma_b^2 = \sum_{h=1}^{L} W_h (\mu_h - \mu)^2$ and $\mu_h$ is the expected value of Y
over stratum h.

Note that $V(\bar{y}_{st,prop})$ will be less than the variance of $\bar{y}$
obtained from random sampling unless $\mu_1 = \mu_2 = \cdots = \mu_L$ and that it
can never be greater. Note also that this equation implies
that the stratification should be done so that the $\mu_h$ differ
widely.
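
The relation between the stratified and simple random sampling variances can be checked with a few lines of arithmetic. The stratum weights, variances and means used below are invented numbers; the function names are hypothetical.

```python
def var_stratified(W, sigma2, n_h):
    """Variance of the stratified mean for arbitrary stratum sample sizes."""
    return sum(w * w * s / n for w, s, n in zip(W, sigma2, n_h))

def var_proportional(W, sigma2, n):
    """The same variance when the allocation is n_h = n * W_h."""
    return sum(w * s for w, s in zip(W, sigma2)) / n

def var_simple_random(W, sigma2, mu_h, n):
    """Return V(Y)/n and the between-strata part of it.

    V(Y) is the within-stratum variance plus the weighted squared
    deviations of the stratum means from the overall mean.
    """
    mu = sum(w * m for w, m in zip(W, mu_h))
    within = sum(w * s for w, s in zip(W, sigma2))
    between = sum(w * (m - mu) ** 2 for w, m in zip(W, mu_h))
    return (within + between) / n, between / n

if __name__ == "__main__":
    W, sigma2, mu_h, n = [0.2, 0.5, 0.3], [4.0, 9.0, 1.0], [1.0, 5.0, 12.0], 100
    print(var_proportional(W, sigma2, n))
    print(var_simple_random(W, sigma2, mu_h, n))
```

The gain from proportional allocation is exactly the between-strata part, so it vanishes only when all the stratum means coincide.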

The variance of a stratified sampling estimate can be fur- 
ther reduced if the sample size in each stratum is made pro- 
portional to the stratum standard deviation. However, in prac- 
tical sampling problems the stratum standard deviations would 
not be known exactly, and estimates would have to be used. This 
would tend to increase the variance of Y and might, in fact, 
lead to poorer results than could be obtained with proportional 
allocation. For this reason, the discussion below will be 
limited to proportional allocation. 

The general requirements that must be met by any stratified 
sampling plan are: 


1. A stratification variable must be specified [5]. 

2. The probability function of the stratification variable 
must be known. 

3. The number of strata must be specified [5]. 

4. The stratum limits must be specified in terms of the 
stratification variable. 




5. The total sample size and the size of the sample to be 
taken from each stratum must be specified [5]. 

6. A method for randomly obtaining the sample from each 
stratum must be specified. 


For stratified sampling to be preferable to simple random
sampling, either (1) the variance of the parameter estimate
obtained from stratified sampling must be less than the variance
obtained from random sampling for a given total cost, or (2) for
a given fixed variance of the parameter estimate, the cost with
stratified sampling must be less than the cost with random
sampling.

Any random variable defined on the elements of the sample
space $S_R$ with known probability function is a possible
stratification variable. A good stratification variable is one
that is highly correlated with the variable Y, and whose
cumulative distribution function F(X) is known and readily
solvable for X given a value of F(X).

Consider the stratification variable

(17)    $X = \sum_{i=1}^{m} \delta_i$, where $\delta_i = \begin{cases} 1 & \text{if } R_i \ge C \\ 0 & \text{if } R_i < C \end{cases}$  $(0 < C < 1)$

As a consequence of the imposed relationship between the 
random numbers and the congestion caused by the events they 
generate, this stratification variable would likely be posi- 
tively correlated with the total customer waiting time and 
thus might be a satisfactory stratification variable. 

The probability distribution of this variable is the binomial
with p = (1-C), i.e.,

(18)    $P(X = k) = \binom{m}{k} (1-C)^k C^{m-k}$,    $k = 0, 1, \ldots, m$;  $0 < C < 1$

The stratum weights and the conditional probability that X
equals k given that X is in stratum h are easily obtained from
this.

The generation of a random value of the stratification var- 
iable and the generation of a random number vector having the 
correct number of observations greater than C are also straight- 
forward. 
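
The three computations just described can be sketched as follows.
This is only an illustration under stated assumptions: strata are
taken to be contiguous ranges of k, and the function names and
the way stratum limits are passed are hypothetical, not the
author's program.

```python
import math
import random

def binomial_pmf(m, c):
    """P(X = k) for k = 0..m, where X counts components with R_i >= c."""
    p = 1.0 - c
    return [math.comb(m, k) * p**k * c**(m - k) for k in range(m + 1)]

def stratum_weights(pmf, limits):
    """limits[h] is the largest k in stratum h; returns W_h per stratum."""
    weights, start = [], 0
    for upper in limits:
        weights.append(sum(pmf[start:upper + 1]))
        start = upper + 1
    return weights

def draw_x_in_stratum(pmf, lower, upper, rng):
    """Draw X = k from P(X = k | lower <= X <= upper)."""
    ks = list(range(lower, upper + 1))
    return rng.choices(ks, weights=[pmf[k] for k in ks])[0]

def draw_vector(m, k, c, rng):
    """Random number vector of length m with exactly k components >= c."""
    high = set(rng.sample(range(m), k))  # positions that get a value >= c
    return [
        c + (1.0 - c) * rng.random() if i in high else c * rng.random()
        for i in range(m)
    ]
```

A stratified observation would then be generated by drawing k
with `draw_x_in_stratum`, building the vector with `draw_vector`,
and passing that vector to the simulation in place of freshly
drawn random numbers.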



One potential difficulty with this scheme should be noted. 
The length of the random number vector must be constant. When 
the simulation program does not also require this, the vector 
length for this procedure must be fixed at, or slightly above, 
the largest value likely to be required. This may be difficult 
since little or no information may be available for making a 
suitable choice. Furthermore, if the vector length is made 
longer than necessary, the advantage of stratification will be 
diminished. 

System-dependent stratified sampling can also be conceived. 
The general approach would be to use a multi-dimensional stra- 
tification variable (vector) with each component the stratifi- 
cation variable for one event type or group of event types. 
Such procedures would generally not be practical, however, 
since the number of strata must be kept small, say less than 
10, and the dimensionality of the stratification vector is 
thus restricted to at most three. 

An alternative approach would be to select one or two event 
types or one or two classes of event types and to use the strat- 
ification scheme outlined above with just these events. This 
approach would tend to be successful if the events selected were
highly correlated with Y.


Importance Sampling 


The final alternative to random sampling is obtained by
substituting for the process under study a substitute process
having the same expected value but smaller variance.

Define the random variable Z on $S_R$ in terms of Y(r), f(r),
and f*(r) by

(19)    $Z(r) = Y(r) f(r) / f^*(r)$

where Y(r) is a random variable whose mean is to be estimated,
f(r) is the joint probability density function of r, and f*(r)
is a joint probability density function which is not zero for
any r.







Clearly, the expectation of Z with respect to the joint
pdf f* is equal to the expectation of Y with respect to the
joint pdf f. Thus sampling from f*(r) and estimating E(Z)
represents an alternative procedure for estimating E(Y).

The variance of Z is given by

(20)    $V(Z) = \int_0^1 \cdots \int_0^1 \frac{Y^2(r) f^2(r)}{f^*(r)} \, dr - [E(Y)]^2$

Note that if Y(r) > 0 for all r, this variance would be zero
if f*(r) could be made equal to Y(r)f(r)/E(Y). Thus the
possibility of a large variance reduction is present if the pdf
f* is chosen judiciously.
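
Both claims follow in one line each from the definition of Z.
The short derivation below is a sketch that uses only the
quantities Y, f, f*, and Z already introduced, and it assumes
Y(r) > 0 so that the optimal f* is a proper density.

```latex
\begin{align*}
E_{f^*}[Z] &= \int \frac{Y(r)\, f(r)}{f^*(r)}\, f^*(r)\, dr
            = \int Y(r)\, f(r)\, dr = E_f[Y], \\
f^*(r) = \frac{Y(r)\, f(r)}{E(Y)}
  \;&\Longrightarrow\;
  Z(r) = \frac{Y(r)\, f(r)}{f^*(r)} = E(Y) \text{ for every } r,
  \text{ so that } V(Z) = 0 .
\end{align*}
```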

The name importance sampling has been given to any such 
sampling procedure in which sampling is performed from a sub- 
stitute joint probability function f* and in which the vari- 
able Z is substituted for the variable of interest Y. 

The main references to importance sampling are papers by 
Kahn and Marshall [18], Kahn [15,16, 17], Marshall [19,20], 
Clark [3], and Ehrenfeld and Ben-Tuvia [6]. The application 
of importance sampling in the simulation of periodic queuing 
systems is not discussed in any of these although Clark, and 
Ehrenfeld and Ben-Tuvia give expository examples of the use 
of importance sampling in simulating simple queuing systems. 

As noted above, if f* were optimally chosen, the variance 
of Z could be reduced to zero. However, in order to do this 
the function Y and its expectation would have to be known, and 
sampling would be unnecessary. 

Practically, the best that can be expected in the use of 
importance sampling is that a "good" form for f* will be 
specified. This problem of specifying f* may be considered in 
two parts: (1) that of specifying a good general form for f* 
involving one or more parameters, and (2) that of specifying 
the parameters. 

The requirements for a good parametric form for f* in the
simulation of periodic queuing systems can be deduced from the
expression for the optimum f*, i.e., f*(r) = Y(r)f(r)/E(Y).

Consider the two random number vectors $r_0 = (0,0,\ldots,0)$
and $r_1 = (1,1,\ldots,1)$. For the simulation models of this
paper $Y(r_0) < Y(r_1)$, and, since f(r) = 1 for all r,
$f^*(r_0)$ should be less than $f^*(r_1)$. Assuming that the
random numbers are independently drawn so that
$f^*(r) = \prod_{i=1}^{m} f^*(r_i)$, this would imply that f*(0)
must be less than f*(1). In a similar manner it can be argued
that f*(r) should be a non-decreasing function of r.

Two additional properties which a suitable substitute dis- 
tribution must possess are: (1) it must be computationally 
feasible to generate a random value from the distribution, and 
(2) it must be computationally feasible to calculate f*(r) 
once the parameters have been specified. 

A substitute pdf which satisfies all these criteria and
which gave the best results of those subjected to testing [22]
is

(21)    $f^*(r_i; \alpha_i) = \alpha_i^{r_i} \ln(\alpha_i) / (\alpha_i - 1)$

where $\alpha_i$ is a parameter used in generating the i-th random
number. Experimental results are reported in the next section.
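
Sampling from a density of the form $\alpha^r \ln(\alpha)/(\alpha - 1)$
on the unit interval is straightforward by inverting its
cumulative distribution function, $(\alpha^r - 1)/(\alpha - 1)$.
The sketch below shows this together with the resulting
importance sampling estimate. It assumes f(r) = 1 on the unit
cube and one common parameter for every component; the function
names and the example response are hypothetical.

```python
import math
import random

def sample_substitute(alpha, rng):
    """Inverse-CDF draw from alpha**r * ln(alpha) / (alpha - 1) on [0, 1]."""
    u = rng.random()
    return math.log(1.0 + u * (alpha - 1.0)) / math.log(alpha)

def substitute_pdf(r, alpha):
    """Value of the substitute density at a single component r."""
    return alpha**r * math.log(alpha) / (alpha - 1.0)

def importance_estimate(y, m, alpha, n, rng):
    """Average of Z = Y(r) f(r) / f*(r), assuming f(r) = 1 on the unit cube."""
    total = 0.0
    for _ in range(n):
        r = [sample_substitute(alpha, rng) for _ in range(m)]
        f_star = math.prod(substitute_pdf(ri, alpha) for ri in r)
        total += y(r) / f_star
    return total / n
```

For instance, `importance_estimate(sum, 3, 2.0, 20000, random.Random(0))`
estimates the mean of the sum of three uniform random numbers;
the estimate should be close to 1.5 whatever value of the
parameter (other than 1) is used, since the weighting by 1/f*
removes the bias introduced by the substitute density.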

For each simulation model and substitute density function
there is an optimum parameter vector, say $\alpha_0$, such that
$V(Z;\alpha_0) \le V(Z;\alpha)$ as $\alpha$ ranges over all values.
The main problem in using importance sampling with a parametric
form of the substitute distribution is thus in specifying
$\alpha$ so that close to this maximum variance reduction will be
obtained.

In practical problems it would be impossible to provide
separate estimates for each of the $\alpha$'s. There must be some
aggregation. In the simplest case the same value would be used
for all the $\alpha$'s. In other applications it might be
desirable to use separate values of $\alpha$ for each type of
system event.

The problem of obtaining a good estimate of $\alpha_0$ can be
approached in a variety of ways. Among these are:

1. Specify $\alpha$ before sampling using any available relevant
information to improve the choice.

Clearly, this presents no theoretical difficulties, and if
$V(Z;\alpha)$ turns out to be close to $V(Z;\alpha_0)$ -- either
because $\alpha$ was close to $\alpha_0$ or because V(Z) was
relatively insensitive to changes in $\alpha$ in the interval in
which $\alpha$ and $\alpha_0$ lie -- this may be a good strategy to
pursue.

The experimental results reported in [22] indicate that
$\alpha_0$ may be relatively constant for a particular simulation
model as the values of the system parameters change over
relatively narrow ranges. This indicates that it may be possible
to estimate $\alpha_0$ fairly accurately from previous simulations
of similar systems.

2. Apply double sampling using the first sample to estimate
$\alpha_0$ and the second to estimate E(Z).

Consider the case with $\alpha_1 = \alpha_2 = \cdots = \alpha_m = \alpha$.
A necessary condition for obtaining the optimum parameter value
is that $[\partial V(Z;\alpha)/\partial\alpha]|_{\alpha=\alpha_0}$
vanish, i.e., that

(22)    $\frac{\partial V(Z;\alpha)}{\partial\alpha} = \int_0^1 \cdots \int_0^1 \frac{Y^2(r) f^2(r) \, [-\partial f^*(r;\alpha)/\partial\alpha]}{f^{*2}(r;\alpha)} \, dr = 0$

assuming, of course, that $f^*(r;\alpha)$ is differentiable.

This integral may be treated as defining the mean with respect
to the pdf f(r) of the random variable W defined on $S_R$ by

(23)    $W(r;\alpha) = \frac{Y^2(r) f(r) \, [-\partial f^*(r;\alpha)/\partial\alpha]}{f^{*2}(r;\alpha)}$

More generally, we may write (22) in the following form

(22a)   $\frac{\partial V(Z;\alpha)}{\partial\alpha} = \int_0^1 \cdots \int_0^1 \frac{Y^2(r) f^2(r) \, [-\partial f^*(r;\alpha)/\partial\alpha]}{f^{*2}(r;\alpha) \, f^*(r;\alpha_1)} \, f^*(r;\alpha_1) \, dr = 0$

where $f^*(r;\alpha_1)$ is the substitute pdf with a fixed value of
the parameter. The integral may then be treated as defining the
mean with respect to $f^*(r;\alpha_1)$ of a random variable defined
on $S_R$ by

(23a)   $W(r;\alpha) = \frac{Y^2(r) f^2(r) \, [-\partial f^*(r;\alpha)/\partial\alpha]}{f^{*2}(r;\alpha) \, f^*(r;\alpha_1)}$







Since Y(r) is unknown, (22) and (22a) cannot be used directly
in estimating $\alpha_0$. But, if sample data were available,
$\alpha_0$ could be estimated by equating the sample estimate of
E(W) to zero and solving for $\alpha$. In a double sampling
scheme, the first sample would be used for this purpose.

Such a procedure introduces a further restriction on the
form of $f^*(r;\alpha)$, namely, that the solution of the equation
$\bar{w} = 0$ for $\alpha$ must be computationally feasible. Many
of the substitute density functions which we have considered
proved to be unsuitable because of this restriction.

The obvious problem with this form of double sampling is
that sampling to estimate $\alpha_0$ expends some of the resources
which would otherwise be available for estimating E(Y).
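
One way the first-sample calculation might look is sketched
below. It is an illustration only, under several assumptions not
fixed by the text: the first sample is drawn by ordinary random
sampling (so f(r) = 1), a single common parameter is used, the
substitute density of each component has the form
$\alpha^r \ln(\alpha)/(\alpha - 1)$, and the equation is solved by
bisection on an arbitrarily chosen bracket. All names are
hypothetical.

```python
import math
import random

def log_substitute_pdf(r, alpha):
    """log f*(r; alpha) for independent components and one common alpha."""
    m = len(r)
    return (m * math.log(math.log(alpha) / (alpha - 1.0))
            + sum(r) * math.log(alpha))

def w_bar(sample, alpha):
    """Sample mean of W = Y^2 f (-df*/dalpha) / f*^2, assuming f(r) = 1."""
    total = 0.0
    for r, y in sample:
        m = len(r)
        # d(log f*)/d(alpha); df*/dalpha equals f* times this quantity.
        dlog = (m * (1.0 / (alpha * math.log(alpha)) - 1.0 / (alpha - 1.0))
                + sum(r) / alpha)
        total += -y * y * dlog / math.exp(log_substitute_pdf(r, alpha))
    return total / len(sample)

def estimate_alpha(sample, lo=1.01, hi=50.0, tol=1e-6):
    """Bisection for a root of w_bar; assumes a sign change on [lo, hi]."""
    f_lo, f_hi = w_bar(sample, lo), w_bar(sample, hi)
    if f_lo * f_hi > 0:
        raise ValueError("no sign change on the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = w_bar(sample, mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)
```

Here `sample` is a list of pairs (r, y) holding each random
number vector of the first sample and the response it produced.
The value returned would then serve as the parameter of the
substitute density for the second sample.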

3. Apply double sampling using the first sample to estimate
$\alpha_0$ but basing the estimate of E(Y) on a combination of the
two samples.

Suppose a sample of $n_1$ observations were obtained from a
joint pdf $f^*(r;\alpha_1)$ and a second sample of $n_2$
observations were obtained from a joint pdf $f^*(r;\alpha_2)$.
Then, if $\bar{Z}_1$ and $\bar{Z}_2$ are respectively the sample
means of the variables $Z_1 = Y(r)f(r)/f^*(r;\alpha_1)$ and
$Z_2 = Y(r)f(r)/f^*(r;\alpha_2)$,

(24)    $\bar{Z} = w_1 \bar{Z}_1 + w_2 \bar{Z}_2$

is an unbiased estimate of E(Y) provided $w_1$ and $w_2$ are
independent of the sampling results and $w_1 + w_2 = 1$. In
particular, let $\alpha_2 = \hat{\alpha}_0$, where $\hat{\alpha}_0$ is
an estimate of $\alpha_0$ which is calculated from the first sample
using the method of the second approach to estimating $\alpha_0$
given above.

The difficulty with such a procedure is in estimating
$V(\bar{Z})$ for use in confidence interval statements. This
variance is given by

(25)    $V(\bar{Z}) = w_1^2 V(\bar{Z}_1) + w_2^2 V(\bar{Z}_2) + 2 w_1 w_2 \mathrm{Cov}(\bar{Z}_1, \bar{Z}_2)$

Estimates of $V(\bar{Z}_1)$ and $V(\bar{Z}_2)$ can easily be
obtained from the samples, but an estimate of the covariance
cannot be obtained from a single sample.







We conjecture, however, that the covariance will be negli- 
gible and can be neglected with little error. Limited experi- 
mental results reported in [22] seem to support this conjecture. 

There remains the question of specifying $w_1$ and $w_2$. We
suggest that crude, but perhaps satisfactory, first guesses for
$w_1$ and $w_2$ could be obtained by ignoring the covariance term
and optimizing $V(\bar{Z})$ with respect to $w_1$. The result is

(26)    $w_1 = \frac{c\,n_1}{n_2 + c\,n_1}$,    $w_2 = 1 - w_1$

where $n_1$ and $n_2$ are respectively the first and second sample
sizes and c is an estimate of the ratio $V(Z_2)/V(Z_1)$.
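
The weighting can be sketched in a few lines. The sketch assumes
that c denotes the ratio of the per-observation variance in the
second sample to that in the first, and that the covariance term
is ignored, so that the weights minimize
$w_1^2 V(Z_1)/n_1 + w_2^2 V(Z_2)/n_2$ subject to $w_1 + w_2 = 1$.
The function names are hypothetical.

```python
def combination_weights(n1, n2, c):
    """Weights minimizing w1^2 V1/n1 + w2^2 V2/n2 with w1 + w2 = 1.

    Assumption: c is the ratio V2/V1 of per-observation variances.
    """
    w1 = c * n1 / (n2 + c * n1)
    return w1, 1.0 - w1

def combined_estimate(z1_bar, z2_bar, n1, n2, c):
    """Weighted combination of the two sample means."""
    w1, w2 = combination_weights(n1, n2, c)
    return w1 * z1_bar + w2 * z2_bar
```

With equal sample sizes and c below one, the second sample (the
one assumed to have the smaller variance) receives the larger
weight.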

These three general approaches to estimating $\alpha_0$ by no
means exhaust the possibilities. In particular, we have not
considered sequential procedures in which the estimate of
$\alpha_0$ is updated as each new simulation observation is
calculated. Such procedures, however, would compound the
difficulties of obtaining unbiased estimates of E(Y) and/or of
estimating the variance of Z.

Importance sampling with a single alpha is completely gen- 
eral purpose and system independent. With two or more alphas 
the procedure becomes system dependent. 

A special form of system dependency is obtained by using 
substitute density functions for generating only a portion of 
the simulated event types. In this procedure the event types 
would be partitioned into two sets, with one containing all 
types in which the average event values are highly correlated 
with Y. Substitute density functions would then be used for 
generating the events of this set, but not for the events of 
the other set. This procedure would yield a smaller variance
reduction than could be obtained using a separate substitute
density function with each event type, but it would also be
significantly less costly.




Experimental Results 


The value of any alternative sampling procedure in the sim- 
ulation of periodic queuing systems can only be determined from 
many actual applications. As a start toward this evaluation, 
in this section we report on some experimental applications of 
the techniques described above in the simulation of six vari- 
ations of a simple single-server system and in the simulation 
of a complex system more of the type likely to be encountered 
in practice. 

The six variations of the single-server system, operating
with a first-come first-served queue discipline, were obtained
by varying the utilization factor $\rho$, the simulation period,
and the forms of the service-time and interarrival-time
distributions. Results obtained with the system-independent
versions of the techniques described above are summarized in
Table 1.

These results clearly show that all four techniques are 
capable of significantly decreasing variability in the simu- 
lations of simple queuing systems. They also indicate that 
the proposed method of stratified sampling is inferior to the 
other three methods and that the regression sampling method 
used is probably inferior to antithetic-variate sampling and 
importance sampling. In comparing antithetic-variate sampling 
to importance sampling in these applications, it must be noted 
that the former is the easiest to apply. 

The second simulation model used in evaluating the candidate 
sampling techniques was a model of a hypothetical, but real- 
istic, truck dock system. This model was motivated by the sys- 
tem studied by Schiller and Lavin [24]. The model is intended 
to represent the operation of a truck loading and unloading 
facility at a plant or warehouse operating on a one eight-hour 
shift per day basis, and, we believe, is more or less typical 
of the type of periodic queuing problems to which Monte Carlo 
methods might be applied. 

Table 1. Comparisons of variance reductions obtained with
system-independent methods applied to the single-server system

[The body of Table 1 is not legible in the source scan. Its
columns give the system parameters (customers per observation,
arrival distribution, service distribution) followed by the
percentage variance reductions obtained with regression sampling,
antithetic-variate sampling, stratified sampling, and importance
sampling. The notes on sample sizes and on the regression
auxiliary variable are likewise not legible.]

Notes to Table 1 that can be read:

Antithetic variates were obtained using the complements of the
random number vectors used in generating the random sampling
estimates.

Stratified sampling used the stratification variable given by
(17) with C = 1/2. For each system, samples were obtained for
various numbers of strata between two and ten. The results
reported are the best of these.

Importance sampling used the substitute density function
$f^*(r) = \ln(\alpha)\,\alpha^r/(\alpha - 1)$.

This model differs from the single-server model in the
following important respects: (1) it consists of several servers
in parallel; (2) each customer belongs to one of four distinct
classes and each class has its own service time distribution;
(3) service time distributions are discrete; (4) the
interarrival time distribution varies over the simulation period
reflecting start-up, shut-down, and "lunch-time" conditions;
(5) service channels are closed during a "lunch period"; and
(6) the simulation period is expressed in terms of a fixed
time period -- one simulated day -- rather than in terms of a
fixed number of customers.

The last difference is of special significance. It makes 
the number of random numbers used in obtaining a simulated 
observation a random variable rather than a constant, and, as 
noted earlier, this poses certain additional difficulties in 
applying some of the sampling techniques. 

Additional characteristics of the system are exponentially 
distributed service times, single queue, and first-come first- 
served queue discipline. 

For this system we shall use as the variable of interest Y 
a cost function incorporating the total waiting time of all 
customers during the simulation period plus an over-time cost 
incurred by operating the facility beyond the end of the "shift" 
in order to complete service on all trucks having arrived by 
an earlier deadline time. 

Two versions of this model were used in the experiments: 
one with eight service channels and one with nine. 

Preliminary results with system-independent methods (Table 
2) failed to show sampling efficiencies with either stratified 
sampling or importance sampling. We thus discarded these two 
methods. 

A final series of experiments was run using both system-
independent and system-dependent versions of regression sampling
and antithetic-variate sampling plus system-dependent versions
of importance sampling. The results are detailed in Table 3.

The auxiliary variables used in the four versions of regression
sampling were:

Method I:    $X(R) = b_1 + b_2 \bar{R}_1 + b_3 \bar{R}_1^2$
Method II:   $X(R) = b_1 + b_2 \bar{R}_1$
Method III:  $X(R) = b_1 + b_2 \bar{A}_1$
Method IV:   $X(R) = b_1 + b_2 \bar{R}_1 + b_3 \bar{R}_2 + b_4 \bar{R}_3 + b_5 \bar{R}_4$

where $\bar{R}_1$, $\bar{R}_2$, $\bar{R}_3$, and $\bar{R}_4$ are the means
of all random numbers used in generating, respectively,
interarrival times, truck type, basic service times, and service
time additions based on the use being made of the truck; and
$\bar{A}_1$ is the mean of all interarrival times.

Table 2. Comparisons of variance reductions obtained with
system-independent methods applied to the truck dock system (1)

Number of   Regression     Antithetic-variate   Stratified     Importance
channels    sampling (2)   sampling (3)         sampling (4)   sampling (5)

    8            9%              18%             Increase       Increase
    9           10%              27%             Increase       Increase

1. Sample sizes were 180 with 8 channels and 360 with 9
   channels except with regression sampling, which was
   applied to the antithetic-variate sampling results
   also, thus doubling the sample size.

2. With an auxiliary variable of the form
   $X = b_1 + b_2(\cdot) + b_3(\cdot)$; the arguments of the
   terms are not legible in the source scan.

3. Antithetic variates were obtained using the complements
   of the random number vectors used in generating the
   random sampling estimates.

4. With the stratification variable given by (17) with
   C = 1/2.

5. With the substitute density function
   $f^*(r) = \ln(\alpha)\,\alpha^r/(\alpha - 1)$.

Table 3. Average variances, variance reductions vs. random
sampling, normalized variance reductions (1), and computer
running times (minutes) obtained with system-dependent methods
applied to the truck-dock system

[The body of Table 3 is not legible in the source scan. Its
columns give the number of service channels, random sampling,
regression sampling (Methods I to IV), antithetic-variate
sampling (full and partial), and importance sampling (one
parameter and four parameters).]

1. Normalized variance reductions are estimated variance
   reductions corrected for the computing time differences.
   Specifically, normalized variance reduction = {1 -
   [(average variance of alternative sampling procedure)
   (computer running time of alternative sampling procedure)] /
   [(average variance with random sampling) (computer running
   time of random sampling)]} 100%.

Two versions of antithetic-variate sampling were used. In
the version labeled "full" each antithetic observation was
obtained using the complement of the random number vector used
in generating the corresponding random sample. In the version 
labeled "partial" the complements of the random numbers used in 
generating interarrival times in the random sample were used 
in generating interarrival times in the antithetic observation, 
but new random numbers were generated for the other three 
events. 
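
The difference between the two versions lies only in which
random number streams are complemented. The sketch below
illustrates this under simplifying assumptions that are not in
the paper: each event type has its own stream of a fixed length,
and the names `antithetic_pair` and `complemented` are
hypothetical.

```python
import random

def antithetic_pair(event_types, length, rng, complemented=None):
    """Return (original, antithetic) dicts: event type -> random numbers.

    Event types listed in `complemented` reuse 1 - r in the antithetic
    run; all other event types receive new random numbers. Passing
    complemented=None complements every event type ("full" version).
    """
    if complemented is None:
        complemented = set(event_types)
    original = {e: [rng.random() for _ in range(length)]
                for e in event_types}
    antithetic = {}
    for e in event_types:
        if e in complemented:
            antithetic[e] = [1.0 - r for r in original[e]]
        else:
            antithetic[e] = [rng.random() for _ in range(length)]
    return original, antithetic
```

The "partial" version corresponds to a call such as
`antithetic_pair(types, length, rng, complemented={"interarrival"})`,
where only the interarrival stream is complemented.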

The one-parameter version of importance sampling used a sub- 
stitute pdf only for generating interarrival times. Other 
simulated events were generated using uniformly distributed 
random numbers. The four-parameter version used a separate 
substitute pdf in generating each of the four types of system 
events. In each case the sample was split into halves; random
sampling was used for the first half; the parameter estimate(s)
for the substitute pdf were calculated, and importance sampling
was used for the second half of the sample. The random and
importance samples were then combined into a single estimate
using (24) and (26) with c = .75.

All variances shown are the average of ten samples. As in- 
dicated, each sample contained either 50, 100, or 200 observa- 
tions, or simulated days, of system operation. Total computer- 
running times are shown. These were obtained from the internal 
clock on the CDC 3600. 
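
The normalized variance reduction defined in the note to Table 3
combines an average variance with a running time. As a small
sketch of that definition (the function name and the numbers in
the usage note are hypothetical, not results from the paper):

```python
def normalized_variance_reduction(var_alt, time_alt, var_random, time_random):
    """Percent variance reduction corrected for computing time.

    Implements {1 - (var_alt * time_alt) / (var_random * time_random)} * 100.
    """
    return (1.0 - (var_alt * time_alt) / (var_random * time_random)) * 100.0
```

A procedure that halves the variance but takes 20 percent longer
to run, e.g. `normalized_variance_reduction(50.0, 1.2, 100.0, 1.0)`,
is credited with a reduction of about 40 percent rather than 50
percent.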




Conclusions 


The experimental results reported in Tables 1, 2, and 3 
appear to support the following tentative conclusions regard- 
ing the merits of the sampling techniques presented in this 
paper when applied in the simulation of periodic queuing sys- 
tems. 

1. Alternatives to random sampling exist that are generally 
applicable and significantly more efficient. 

2. System-dependent methods are more efficient than system- 
independent methods. 

3. System-dependent antithetic-variate sampling and system- 
dependent importance sampling are more efficient than corre- 
lated sampling. 

4. Multi-parameter importance sampling is less efficient
than single-parameter importance sampling. This is primarily
due to the larger computing time needed for calculating
parameter estimates.


Bibliography


1. Albert, G.E. "A General Theory of Stochastic Estimates of
   the Neumann Series for the Solutions of Certain Fredholm
   Integral Equations and Related Series," Symposium on
   Monte Carlo Methods. Edited by H.A. Meyer. New York: John
   Wiley & Sons, Inc., 1956.

2. Clark, C.E. "The Utility of Statistics of Random Numbers,"
   Operations Research, VIII (1960), 185-195.

3. Clark, C.E. "Importance Sampling in Monte Carlo Analyses,"
   Operations Research, IX (1961), 603-620.

4. Cochran, W.G. Sampling Techniques. 2nd ed. New York: John
   Wiley & Sons, Inc., 1963.

5. Dalenius, T. Sampling in Sweden - Contributions to the
   Methods and Theories of Sample Survey Practice. Stockholm:
   Almqvist and Wiksell, 1957.

6. Ehrenfeld, S., and Ben-Tuvia, S. "The Efficiency of
   Statistical Simulation Procedures," Technometrics, IV (1962),
   257-275.

7. Evans, W.D. "On Stratification and Optimal Allocation,"
   Journal of the American Statistical Association, XLVI
   (1951), 95-104.

8. Halton, J.H., and Handscomb, D.C. "A Method for Increasing
   the Efficiency of Monte Carlo Integration," Journal
   of the Association for Computing Machinery, IV (1957),
   329-340.

9. Hammersley, J.M., and Handscomb, D.C. Monte Carlo Methods.
   New York: John Wiley & Sons, Inc., 1964.

10. Hammersley, J.M., and Mauldon, J.G. "General Principles
    of Antithetic Variates," Proceedings, Cambridge
    Philosophical Society, LII (1956), 476-481.

11. Hammersley, J.M., and Morton, K.W. "A New Monte Carlo
    Technique: Antithetic Variates," Proceedings, Cambridge
    Philosophical Society, LII (1956), 449-475.

12. Handscomb, D.C. "Proof of the Antithetic Variates Theorem
    for n>2," Proceedings, Cambridge Philosophical Society,
    LIV (1958), 300-301.

13. Hansen, M.H., Hurwitz, W.N., and Madow, W.G. Sample Survey
    Methods and Theory, Vol. I: Methods and Applications.
    New York: John Wiley & Sons, Inc., 1953.

14. Hansen, M.H., et al. Sample Survey Methods and Theory,
    Vol. II: Theory. New York: John Wiley & Sons, Inc., 1953.

15. Kahn, H. "Modification of the Monte Carlo Methods,"
    Scientific Computation Seminar Proceedings, IBM Applied
    Science Department, (1949), 20-27.

16. Kahn, H. "Use of Different Monte Carlo Sampling
    Techniques," Symposium on Monte Carlo Methods. Edited by
    H.A. Meyer. New York: John Wiley & Sons, Inc., 1956.

17. Kahn, H. Applications of Monte Carlo. The RAND
    Corporation, RM-1237-AEC, 1956.

18. Kahn, H., and Marshall, A. "Methods of Reducing Sample
    Size in Monte Carlo Computations," Journal of the
    Operations Research Society of America, I (1953), 263-278.

19. Marshall, A.W. "The Use of Multi-Stage Sampling Schemes
    in Monte Carlo Computations," Symposium on Monte Carlo
    Methods. Edited by H.A. Meyer. New York: John Wiley &
    Sons, Inc., 1956.

20. Marshall, A.W. Experimentation by Simulation and Monte
    Carlo. The RAND Corporation, Paper P-1174, 1958.

21. Morton, K.W. "A Generalization of the Antithetic Variate
    Method for Evaluating Integrals," Journal of Mathematics
    and Physics, XXXVI (1957), 289-293.

22. Moy, W.A. "Sampling Techniques for Increasing the
    Efficiency of Simulations of Queuing Systems." Unpublished
    Ph.D. dissertation, Northwestern University, Evanston,
    Ill., 1965.

23. Page, E.S. "On Monte Carlo Methods in Congestion Problems:
    II, Simulation of Queuing Systems," Operations Research,
    XIII (1965), 300-305.

24. Schiller, D.H., and Lavin, M.M. "The Determination of
    Requirements for Warehouse Dock Facilities," Operations
    Research, IV (1956), 231-243.

25. Tocher, K.D. The Art of Simulation. Princeton, N.J.: D.
    Van Nostrand Co., Inc., 1963.

26. Tukey, J.W. "Antithesis or Regression," Proceedings,
    Cambridge Philosophical Society, LIII (1957), 923-924.


Monte Carlo Techniques: 
A Comment 


Jack P. Kleijnen, Tilburg University 


Before Dr. Handscomb starts discussing various variance
reduction techniques, he states that "these techniques are by no
means mutually exclusive." Actually there are two techniques
that may conflict. These two techniques are "control variates"
and "antithetic variates". As the simultaneous application of
both techniques is implicitly recommended by Dr. Handscomb and
many other prominent authors on simulation, it seems useful to
discuss the possible conflict between both techniques in some
detail.

The form of control variates we are referring to consists
of using the same sequence of random numbers in the simulation
of two similar systems. Common random numbers create positive
correlation between the responses of the two simulated systems.
This positive correlation is desirable. For, if $x_i$, the
response of the first system in the ith run or replication
(i=1,...,n), is, say, above its true level, then $y_i$, the
corresponding response of the second system, is also above its
true level. In this way the comparison of both systems is less
distorted.

Antithetic variates create a negative correlation between
two runs of the same system by using the vectors of random
numbers {r} and {1-r} respectively. So if one run gives a
response, say, above its true level, then the antithetic
companion run gives a response below the true level. On
averaging, these deviations compensate each other approximately.

Now consider Table 1. Column (2) shows that system 1 is
simulated using antithetic variates. This technique creates
the desirable negative correlations between $x_1$ and $x_2$,
$x_3$ and $x_4$, etc. In the same way column (4) shows that
antithetic variates are applied in the second system, creating
negative correlation between $y_1$ and $y_2$, $y_3$ and $y_4$, etc.
Further, the ith row of Table 1 shows that the ith run of each
system uses the same random numbers. So the desirable positive
correlation is created between $x_1$ and $y_1$, $x_2$ and $y_2$,
$x_3$ and $y_3$, etc.


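Table 1. Simultaneous application of common
random numbers and antithetic variates

            System 1                  System 2
Run    Random      Response      Random      Response
       numbers                   numbers
(1)     (2)          (3)          (4)          (5)

 1     {r_1}         x_1         {r_1}         y_1
 2     {1-r_1}       x_2         {1-r_1}       y_2
 3     {r_2}         x_3         {r_2}         y_3
 4     {1-r_2}       x_4         {1-r_2}       y_4
 .       .            .            .            .
 .       .            .            .            .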


However, Table 1 also shows that the simultaneous use of
both techniques generates undesirable correlations: there are
negative correlations between $x_1$ and $y_2$, $x_2$ and $y_1$,
$x_3$ and $y_4$, $x_4$ and $y_3$, etc. These correlations are
undesirable as, for instance, a high value of $x_1$ goes together
with a low value of $y_2$, so that the comparison of both systems
is distorted.

The effect of the various correlations (or covariances) is
shown more rigorously in equation (2) below. We estimate
the difference in system performance by

(1)    $d = \bar{x} - \bar{y} = n^{-1} \sum_{i=1}^{n} x_i - n^{-1} \sum_{i=1}^{n} y_i$

where, for convenience, we assume an equal number of runs per
system. Then the variance of this estimator is given by

(2)    $\mathrm{var}(d) = n^{-2} \{ n\,\mathrm{var}(x_i) + n\,\mathrm{var}(y_i) + \sum_{i \ne j} \sum \mathrm{cov}(x_i, x_j) + \sum_{i \ne j} \sum \mathrm{cov}(y_i, y_j) - 2 \sum_{i} \sum_{j} \mathrm{cov}(x_i, y_j) \}$
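
The way the individual covariance terms enter var(d) can be
explored numerically. The sketch below evaluates the expression
for var(d) as the sum of the two variance terms, the
within-system covariances over unequal run indices, and minus
twice the between-system covariances over all pairs of runs. The
unit variances, the pairing of runs, and the correlation values
used in the usage note are hypothetical and serve only as an
illustration.

```python
def var_d(n, var_x, var_y, cov_xx, cov_yy, cov_xy):
    """var(d) for d = mean(x) - mean(y) over n runs per system.

    cov_xx, cov_yy, and cov_xy are callables on run indices (i, j).
    """
    total = n * var_x + n * var_y
    for i in range(n):
        for j in range(n):
            if i != j:
                total += cov_xx(i, j) + cov_yy(i, j)
            total -= 2.0 * cov_xy(i, j)
    return total / n**2

def partners(i, j):
    """Runs 0-1, 2-3, ... are assumed to form the antithetic pairs."""
    return i != j and i // 2 == j // 2

def scenario(n, rho_common, rho_anti, rho_cross):
    """var(d) for unit-variance responses under assumed correlations."""
    def within(i, j):
        return -rho_anti if partners(i, j) else 0.0

    def across(i, j):
        if i == j:
            return rho_common  # same random numbers in both systems
        return -rho_cross if partners(i, j) else 0.0

    return var_d(n, 1.0, 1.0, within, within, across)
```

With four runs, `scenario(4, 0.8, 0.0, 0.0)` represents common
random numbers only and `scenario(4, 0.8, 0.5, 0.6)` represents
both techniques together. For these particular assumed values
the combined scheme gives the larger var(d), which is the kind
of conflict described above.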

Besides the above theoretical evidence of a possible
conflict between the two variance reduction techniques, we
obtained some empirical evidence. We considered a queuing system
with four service stations in sequence and we simulated two
versions of this system. The two versions differed in the
parameters of the four service time distributions. The system is
described in more detail in reference [4]. The two versions
we simulated are the plans I and II in Table 1 of [4]. The
following results for var(d) were found:


Only common random numbers: An9 
Common random numbers and 

antithetic variates: Sr 
Only antithetic variates: ZL penO) 
Independent runs: Set 


We remark that both system variants simulated are very
similar. If the variants differ more (e.g. the number of service
stations becomes different), then it might very well
happen that the above classification of techniques changes.
So it is not wise to conclude from the above example that a
particular technique is best in all applications. A tentative
procedure for selecting in a particular case the best technique
(or combination of techniques) has been described by the
author elsewhere [3].

Dr. Moy has restricted his investigation of variance-reducing
procedures to "periodic queuing systems" and to "scalar-valued"
response functions. We shall examine briefly whether
it seems possible to generalize his results.

Instead of a periodic queuing system we may simulate systems
where interest is in the steady state characteristics. In
order to estimate the expected value of a response variable in
the steady state, we shall use long runs. For, the longer the
run the more we approximate steady state conditions. Moreover,
replication of runs means that we have to go through an
initial phase, which is useless for estimating steady
state characteristics. So the number of runs tends to be
small in steady state simulations.

The small number of replicated runs may give problems in
regression sampling. For we have to split the runs into two
groups in order to obtain estimates of the parameters $b_{opt}$
which are independent of the auxiliary variables $x_j$ with which
these $b_{opt}$ are to be used. If the number of replications is
small, then there are only a few replications available for
estimating $b_{opt}$. This increases the probability of obtaining
wrong estimates of $b_{opt}$. Wrong estimates decrease the vari-
ance reduction; nevertheless the estimates of the response
remain unbiased.

The same problem arises in importance sampling: The first
part of the simulation is used to estimate the parameters $\alpha_{opt}$,
which are then applied in the runs of the second part. If
there are only a few runs available, $\alpha_{opt}$ may be badly estimated.
Again this decreases the variance reduction, but does not bias
the estimate of the average response.

Finally, we point out an advantage in steady state simula- 
tions: If m, the number of random numbers per run, varies from 
run to run, then problems arise with several variance reduc- 
tion techniques. In periodic queuing systems m depends on the 
events during the simulation and is not under our control. In 
steady state simulations, however, we can fix m at such a value 
that steady state conditions prevail. 

The second restriction of scalar-valued functions means that 
there are no multiple responses. Assuming a single response 
is realistic if we have to choose among various systems. For, 
in order to choose we need a criterion, i.e. we need one value 
per system. In management models this criterion may be total 
profit. In economic models finding a criterion is more diffi- 
cult; we may use utility functions like Fromm [2]. Anyhow, 


in practice and especially in the preliminary phase of an 


Monte Carlo Techniques: A Comment 293 





investigation we may be interested in several responses like 
waiting time of customers and idle time of machines, or unem- 
ployment and the balance of payments. 

Multiple responses give problems with importance sampling; 
they do not give problems with the other three procedures. In 
importance sampling we have to distinguish between two cases: 

(i) The responses react in the same direction on the
random numbers $r_i$. Still each kind of response gives a
different value for $\alpha_{opt}$ according to Dr. Moy's estimation
procedure represented by his equation (22a). It is ques-
tionable whether a compromise $\alpha_{opt}$ would still result in
variance reduction.

(ii) The responses react in different directions on
$r_i$. For instance, while the waiting time of customers $y_1$
increases with increasing random numbers $r_i$, the idle time
of machines $y_2$ decreases.


If we define the vectors of random numbers $r_0$ and $r_1$ as

$r_0 = (0, 0, \ldots, 0)$
$r_1 = (1, 1, \ldots, 1)$

then the above relations imply

(3) $y_1(r_0) < y_1(r_1)$

whereas

(4) $y_2(r_0) > y_2(r_1)$


Therefore it is difficult to choose a parametric form for
$f^*$ as the new probability density function of the random num-
bers used in importance sampling. Should $f^*(r)$ be a decreas-
ing or increasing function of $r$?

A variance reduction technique discussed by both Dr. Hands-
comb and Dr. Moy is stratified sampling. Both authors con-
centrate on "proportional stratified sampling," i.e. take $n_j = p_j n$
observations from the jth stratum, where $p_j$ is the prob-
ability that an observation falls in the jth stratum and $n$ is
the total sample size. Dr. Handscomb points out the waste
involved in having to discard an observation, once the re-
quired number of observations in the corresponding stratum
has been sampled. Dr. Moy presents a method for determining
beforehand from which stratum an observation comes so that no
observations are sampled from a particular stratum once that
stratum is full and therefore no discarding of observations
is necessary.

An alternative stratification procedure, however, would be:
Do not fix $n_j$ (equal to $p_j n$). Simply generate $n$ runs, each
of length $m$. Afterwards we determine the $n$ values of Dr. Moy's
stratification variable $x$ in order to classify $y$, the response
of the run. Then we calculate

(5) $\bar{y}_{st} = \sum_j p_j \bar{y}_j = \sum_j p_j \frac{1}{n_j} \sum_{i=1}^{n_j} y_{ij}$

where $n_j$ depends on the outcome of the experiment. This esti-
mator differs from the unstratified estimator

(6) $\bar{y} = \frac{1}{n} \sum_j \sum_{i=1}^{n_j} y_{ij} = \sum_j \frac{n_j}{n} \bar{y}_j$

as the experimental weights $n_j/n$ in (6) are replaced by the
theoretical weights $p_j$ in (5).
Cochran [1, p. 135] states that the above stratification-
after-sampling "is almost as precise as proportional strati-
fied sampling, provided that the sample is reasonably large,
say > 20, in every stratum." Moreover, we see that this al-
ternative procedure takes less computer time than Dr. Moy's 
procedure as no value of the stratification variable x needs 
to be sampled and no special procedure for generating a se- 
quence of random numbers with the required composition is 
needed. 
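
The two estimators (5) and (6) can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the function names and the data layout (one response and one stratum label per run, plus a dictionary of known stratum probabilities) are hypothetical and do not come from Dr. Moy's paper or from Cochran.

```python
from collections import defaultdict


def post_stratified_mean(responses, strata, stratum_probs):
    """Estimator (5): weight each stratum mean by its known probability p_j."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for y, j in zip(responses, strata):
        totals[j] += y
        counts[j] += 1
    # n_j is random here, so a stratum may turn out to be empty.
    empty = [j for j in stratum_probs if counts[j] == 0]
    if empty:
        raise ValueError(f"no runs fell in strata {empty}")
    return sum(p * totals[j] / counts[j] for j, p in stratum_probs.items())


def unstratified_mean(responses):
    """Estimator (6): the plain average, i.e. experimental weights n_j / n."""
    return sum(responses) / len(responses)
```

Note that the sketch refuses to return an estimate when a stratum receives no runs, which is one practical reason for Cochran's proviso that the sample be reasonably large in every stratum.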

We want to make one more remark about stratification. Near 
the end of his discussion on stratification Dr. Moy states that
"the advantage of stratification will be diminished" if m, the
length of the random number vector (i.e. the length of a run) 
is not constant. However, we wonder whether stratification
will be possible at all! For instance, suppose that the largest
value of m likely to be required is 750. So we divide the
range of the stratification variable X into, for instance, the
following five strata $S_j$ $(j = 1, \ldots, 5)$:

[0,150), [150,300), [300,450), [450,600), [600,750]


We sample a value for X from the last stratum, say, X = 723. 
Then we start using Moy's procedure for generating a vector of 
random numbers with the required composition, i.e. a vector of 
750 random numbers, 723 numbers being greater than C(0<C<1). 
However, after, say, 500 observations the simulation stops. 
The value of X is then approximately 


(7) $(500/750) \times 723 = 482$

So X belongs to stratum 4 instead of 5! It might look reason-
able to place the response y in stratum 4 and to use the
weight

(8) $w_4 = P(X \in S_4 \mid m = 500)$

But actually y was generated using the value 723 of X which
was sampled from

(9) $P(X = k \mid X \in S_5,\ m = 750)$


So we wonder in which stratum a response should be put and 
which weights should be used, if the length of the random 
number vector varies from run to run. 
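
The arithmetic of this example can be checked with a short sketch. The helper `stratum_index` and the variable names are our own illustrative choices; the strata, the value 723 and the run lengths 750 and 500 are those of the example above.

```python
def stratum_index(x, width=150, n_strata=5):
    """1-based stratum number for the strata [0,150), ..., [600,750]."""
    return min(int(x // width), n_strata - 1) + 1


planned_m, actual_m = 750, 500   # planned and realized length of the run
sampled_x = 723                  # value of X drawn from the last stratum

# Equation (7): approximate value of X when the run stops early.
realized_x = sampled_x * actual_m / planned_m

print(stratum_index(sampled_x), realized_x, stratum_index(realized_x))
```

The sampled value lies in stratum 5 while the realized value lies in stratum 4, which is exactly the ambiguity about the proper stratum and weight raised above.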

At the end of the section on importance sampling in his
equation (24) Dr. Moy suggests estimating the mean response
of the system by the estimator

(10) $Z = w_1 Z_1 + (1 - w_1) Z_2 \quad (0 < w_1 < 1)$

Assuming that the weight $w_1$ is "independent of the sampling
results" Dr. Moy derives the expectation and variance of the
estimator $Z$. However, in the next step Dr. Moy estimates the
optimal value of $w_1$ by

(11) $\hat{w}_1 = \dfrac{n_1}{n_1 + n_2\,(\widehat{\mathrm{var}}\,Z_1 / \widehat{\mathrm{var}}\,Z_2)}$





So, the weight is no longer "independent of the sampling re-
sults." We shall briefly examine the consequences of the
stochastic nature of the weight.

It is easy to see that the estimator defined by (10) and
(11) is still unbiased. For

(12) $E(Z) = E_w[E(Z \mid w)] = E_w[\mu] = \mu$

The variance, however, is given by

(13) $\mathrm{var}(Z) = E_w[\mathrm{var}(Z \mid w)] + \mathrm{var}[E(Z \mid w)]$

Equation (25) in Dr. Moy's paper gives $\mathrm{var}(Z \mid w)$. If we neg-
lect the covariance term in that equation and use

(14) $\mathrm{var}[E(Z \mid w)] = \mathrm{var}[\mu] = 0$

then (13) reduces to

(15) $\mathrm{var}(Z) = E(w_1^2)\,\mathrm{var}(Z_1) + E[(1 - w_1)^2]\,\mathrm{var}(Z_2)$

It seems obvious to estimate var(Z) by

(16) $\hat{w}_1^2\,\widehat{\mathrm{var}}(Z_1) + (1 - \hat{w}_1)^2\,\widehat{\mathrm{var}}(Z_2)$

However, (16) is a biased estimator of (15) as $\hat{w}_1$ and $\widehat{\mathrm{var}}(Z_1)$,
and $\hat{w}_1$ and $\widehat{\mathrm{var}}(Z_2)$ are dependent. For (11) shows that $\hat{w}_1$ de-
pends on $\widehat{\mathrm{var}}(Z_1)$ and $\widehat{\mathrm{var}}(Z_2)$.

It remains to be investigated how serious this bias is
(which bias increases the bias caused by neglecting the covar-
iance term). This bias can be excluded by determining the
weight on a priori information or by using a part of the total
sample to estimate $w_1$ and the other part to estimate $\mathrm{var}(Z_1)$
and $\mathrm{var}(Z_2)$.
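
The second remedy, splitting the sample, might be sketched as follows. This is a hypothetical illustration and not Dr. Moy's procedure: we assume that each of the two estimators is a plain average of its own observations, that the weight is the familiar inverse-variance weight, and that each list holds at least four observations whose first halves are not all identical. The function name is ours.

```python
import statistics


def split_sample_estimate(z1_obs, z2_obs):
    """Estimate w_1 from the first halves, everything else from the second halves."""
    h1, h2 = len(z1_obs) // 2, len(z2_obs) // 2
    weight_part1, est_part1 = z1_obs[:h1], z1_obs[h1:]
    weight_part2, est_part2 = z2_obs[:h2], z2_obs[h2:]

    # Inverse-variance weight (an assumption), from the first halves only.
    v1 = statistics.variance(weight_part1) / len(weight_part1)
    v2 = statistics.variance(weight_part2) / len(weight_part2)
    w1 = v2 / (v1 + v2)

    # Means and their variances from the second halves, independent of w1.
    m1, m2 = statistics.fmean(est_part1), statistics.fmean(est_part2)
    var_m1 = statistics.variance(est_part1) / len(est_part1)
    var_m2 = statistics.variance(est_part2) / len(est_part2)

    z = w1 * m1 + (1.0 - w1) * m2                           # form of equation (10)
    var_z = w1 ** 2 * var_m1 + (1.0 - w1) ** 2 * var_m2     # form of equation (16)
    return w1, z, var_z
```

Because the weight and the variance estimates come from disjoint observations, the dependence that biases (16) is removed, at the price of using only part of the sample for each purpose.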


Bibliography 


1. Cochran, W.G. Sampling Techniques. 2nd ed. New York: John 
Wiley & Sons, 1963. 







2. Fromm, Gary. "An Evaluation of Monetary Policy Instru-
ments," paper presented at the Econometric Society Meet-
ings, San Francisco, December, 1966.


3. Kleijnen, J.P. "Increasing the Reliability of Estimates
in the Simulations of Systems: Negative and Positive Cor-
relation Between Runs," private circulation, Tilburg
School of Economics and Business Administration, Nether-
lands, April, 1968.


4. Naylor, Thomas H., Wertz, K., and Wonnacott, T.H. "Methods
for Analyzing Data from Computer Simulation Experiments," 
Communications of the ACM, X (November, 1967), 703-710. 


The Value of Sample 
Information 


Robert H. Hayes, Harvard University 


Introduction 


In this paper we will focus our attention on the simplest 
experimental design problem in a Monte Carlo simulation ex- 
periment: the decision of how large a sample to take (i.e., 
how long to simulate), given certain assumptions about the 
probabilistic nature of the process being sampled and the 
experimenter's loss/reward environment. 

Implicit in our analysis is the crucial assumption that
the experimenter is willing to use the values generated by
the simulation model as his basis for choosing, from among
a prespecified finite set of alternatives, a future course of
action or an hypothesis to guide his future thinking about a
certain situation. (Hopefully this wording insulates us
against any contamination from such problems as "validity"
and "accuracy.") Moreover, we assume that this decision will
be based on the value of a single criterion (a criterion
which might, of course, represent the functional incorpora-
tion of several criteria). So restricted, our problem re-
duces to the classical one of choosing "the best of several
processes," which has generated a substantial literature (see
[1] for an historical account of the problem and its variants).
We will adopt the "Bayesian approach" which, in the case of 
this particular problem, is identified with the names of John 
Pratt, Howard Raiffa and Robert Schlaifer (see [6] and [7]). 

By thus confining our scope we will be turning our backs
on many of the crucial aspects of the total experiment. Such
fundamental questions as "Is the model a useful (if not ex-
act) depiction of the actual situation?" (containing such
loaded corollaries as "Does the computer model do what the
experimenter thinks it does?"), "Do the decision-alternatives
being investigated contain the optimal one - or even a good
one?", "Is the criterion a valid one?", and "Is the whole
thing worth the cost and effort?" will be avoided. Nor will
we concern ourselves here with some of the more sophisticated 
and powerful sampling procedures, such as multiple stage, 
sequential, or stochastic search. Finally, although in some 
theoretical senses this "simple" problem is "solved," there 
remain problems of a computational nature that can impair the 
practical application of the theory. 

By such disclaimers we do not, however, want to give the 
impression that this approach, and the theory it has spawned, 
are of minor importance and usefulness. On the contrary, not 
only can the theory be directly applied in many cases (even 
when a stochastic search procedure is called for, for example, 
it often provides a useful "end game"), but the approach alone 
provides a rich source of insight into a variety of experi- 
mental design problems. The nature of most simulation exper- 
iments, in fact, makes its application more defensible there 
than it might be in other types of experimental conditions. 

We shall begin with a brief exposition of the basic theory, 
then try to indicate the circumstances that would permit the 
relaxation of certain constraining assumptions and facilitate 
the implementation of the theory, and finally report on some 
of the recent research in this area that appears to be par-
ticularly relevant.


An Example 


To focus our discussion we will utilize the following model 
situation: a company must expand its production capacity for 
a certain product and is considering various alternative loca- 
tions for a new plant. The economic attractiveness of each 
location will be affected principally by the demand pattern 




for the product (both across geographic areas and over time), 
and to a lesser extent by the evolution over time of the wage 
rate in the area. The company has narrowed its choice of 
locations down to a few most promising candidates, and has 
decided to use the discounted present value of the net revenue 
stream as its criterion (at a prespecified discount rate and 
over a prespecified number of years). 


The Case of Two Alternatives 


Let us begin with an analysis of the simplest case: the 
choice among only two alternative locations (call them A 
and B). We further simplify the problem by assuming that the 
marginal distributions of the (unknown) discounted present 
values associated with the two locations can be modeled ade- 
quately by two independent normal data-generating processes 


having known variances but unknown means: 


(1) $\tilde{x}_{ai} = \tilde{\mu}_a + \tilde{\epsilon}_{ai} \quad (a = A, B)$

where $\tilde{x}_{ai}$ represents the present value observed in connec-
tion with location a on trial i (we adopt the
convention of using overdrawn tildes to desig-
nate random variables),

$\tilde{\mu}_a$ is the unknown mean value of $\tilde{x}_{ai}$ and represents
the combined effect of "demand" and "wage rate"
(we also adopt the "personalistic" view of prob-
ability, which enables us to call any unknown
quantity a "random variable"),

and $\tilde{\epsilon}_{ai}$ is an error term obeying the usual assumptions:

$\tilde{\epsilon}_{ai} \sim N(0, \sigma_a^2)$, with $\sigma_a^2$ known, and $E(\tilde{\epsilon}_{ai}\tilde{\epsilon}_{bj}) = 0$ for $(a, i) \neq (b, j)$.

For the time being we will also make the important assumption
that the company is "risk neutral", i.e., willing to base its
decision solely on the expected values $\mu_A$ and $\mu_B$.

Finally, the company is able to express its uncertainty
about $(\tilde{\mu}_A, \tilde{\mu}_B)$ in the form of a bivariate normal "prior" dis-
tribution with

(2) expectation vector $E(\tilde{\mu}) = \begin{pmatrix} \mu'_A \\ \mu'_B \end{pmatrix} = M'$

and covariance matrix $\mathrm{var}(\tilde{\mu}) = \begin{pmatrix} V'_{AA} & V'_{AB} \\ V'_{AB} & V'_{BB} \end{pmatrix} = V'$

It will also be convenient to define the "precision matrix"
of this distribution:

$H' = (V')^{-1}$

Exactly how this prior distribution has been assigned will 
depend on the situation. In some cases it may be assessed in 
light of a substantial pilot sample; in others, based on more 
subjective information such as that gleaned in the process of 
"debugging" the program; in still others it may be purely sub- 
jective. In any case we take the full-fledged subjectivist 
point of view and assume that it exists, and can be revealed 
through intelligent questioning. In fact, the careful pre- 
analysis required in most simulation studies, together with 
the opportunity for validation runs and controlled experi- 
mentation, probably makes the assignment of a prior distri- 
bution easier in these circumstances than it generally is. 
Moreover, if one is willing to assume that the size of the 
sample to be conducted is so large that the information pro- 
vided by it will overwhelm whatever information is contained 
in the prior (as is often the case), one can assume a "flat" 
prior (H' = 0) and avoid the whole problem altogether. Alterna-
tively, one might want to arbitrarily test each process 100 
times (say) and base his prior on the sample statistics so 
generated. This would amount to a (suboptimal) two-stage 
sampling procedure, which may very likely turn out to be bet- 
ter than a one-stage plan based on a flat prior - particularly 
if there are more than two processes. 

We now partition our analysis into five segments: 







What Action Should be Taken, and What Will be its
Expected Consequences, if no Sampling of $\tilde{x}_{ai}$ Values
were to be Conducted?

The essence of this problem is captured by the decision 
tree given in Figure 1. Clearly the optimal selection at 
this point in time is that process whose prior mean is the 
larger. We will denote that process (location) by $a_*$, and its
expected value by $\mu'_*$.

Figure 1. Decision tree for analysis
of prior problem (stages: decision, outcome, value)


What is the Expected Value of Perfect Information (EVPI) 
About the Processes? 


One can gain additional information about the processes, 
at a price, by sampling. In this context "random sampling" 
refers to the drawing, with replacement, from the population 
of demand-wage histories contained in the simulation model. 
A single sample outcome may be the end result of a number of 
Monte Carlo random event selections - as in our current ex- 
ample, or when the "process" being evaluated is a decision 
rule. In guiding his thinking about the various sampling 
plans available to him, therefore, the experimenter might 
well be interested in the most that he should be willing to 
pay for such information. 

The most advantageous position for the experimenter to be
in would clearly be one in which he could observe $\tilde{\mu}_A$ and $\tilde{\mu}_B$
directly (without error) before making a choice (in the con-
text of our example this is roughly the same as taking a very
large sample). This possibility is equivalent to "reversing
the tree" in Figure 1 (see Figure 2a). Taking the expecta-
tion of these optimal selections over the prior distribution
of $(\tilde{\mu}_A, \tilde{\mu}_B)$ yields the "expected value (of the decision) con-
ditional on the availability of perfect information about
both processes." The Expected Value of Perfect Information
(EVPI) is then equal to the difference between this ideal
value and the value associated with the choice dictated by
the (imperfect) prior distribution:

(3) $\mathrm{EVPI} = E\{\max(\tilde{\mu}_A, \tilde{\mu}_B)\} - \mu'_* = E\{\max(\tilde{\mu}_A, \tilde{\mu}_B) - \tilde{\mu}_*\}$.


In the bivariate case the EVPI can be obtained more simply
by calculating (assuming, for expository purposes, that $a_*$
is Location A) the expectation of the piecewise linear func-
tion $[\max(0, \tilde{\mu}_B - \tilde{\mu}_A)]$ over the distribution of $\tilde{\delta} = (\tilde{\mu}_B - \tilde{\mu}_A)$;
see Figure 3. Since $\tilde{\delta}$ is normally distributed

(4) with $E\tilde{\delta} = \mu'_B - \mu'_A$
and $V(\tilde{\delta}) = V'_{AA} + V'_{BB} - 2V'_{AB}$,

this expectation is easily obtainable from tables of the
"Normal Linear Loss Integral" [7, p. 356].

It should be noticed that within this conceptualization of 
the problem it would also be possible to talk about the EVPI 
associated with demand only. In this case the act fork in 
Figure 2a would occur between the demand fan and the wage
rate fan; see Figure 2b.
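
Where tables of the linear loss integral are not at hand, the two-alternative EVPI can be computed directly. The sketch below is ours: the function names are hypothetical, and it relies on the standard closed form of the unit normal linear loss, density minus k times the right tail area, applied to the standardized gap between the two prior means.

```python
import math


def normal_pdf(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def normal_cdf(u):
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))


def unit_normal_linear_loss(k):
    """E max(0, t - k) for t distributed N(0, 1)."""
    return normal_pdf(k) - k * (1.0 - normal_cdf(k))


def evpi_two_alternatives(mean_a, mean_b, var_a, var_b, cov_ab=0.0):
    """EVPI for a bivariate normal prior on the two process means."""
    sd_delta = math.sqrt(var_a + var_b - 2.0 * cov_ab)   # std. dev. of the difference
    if sd_delta == 0.0:
        return 0.0                                       # no uncertainty about the better process
    gap = abs(mean_a - mean_b)                           # lead of the process chosen a priori
    return sd_delta * unit_normal_linear_loss(gap / sd_delta)
```

For instance, with equal prior means and unit, uncorrelated prior variances the function returns the reciprocal of the square root of pi, about 0.564; the value falls as the gap between the prior means widens.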


Figure 2a. Decision tree for evaluation of perfect information
(all uncertainty removed); stages: outcome, decision, value $\max(\tilde{\mu}_A, \tilde{\mu}_B)$

Figure 2b. Decision tree for evaluation of perfect information
(only uncertainty about demand removed); stages: outcome (demand), decision, outcome (wage rate)

Calculation of EVPI ($\tilde{\delta} = \tilde{\mu}_B - \tilde{\mu}_A$); loss function $\max(0, \tilde{\delta})$


What is the Value of the Information Contained in a
Given Sample Outcome?

Up to this point we have been subjectivist but not Bay-
esian. (We might similarly classify the "empirical Bayes
approach" of Herbert Robbins as Bayesian but not subjectiv-
ist.) After conducting random samples of size $(n_A, n_B)$ on
the two processes, however, we can utilize Bayes' theorem to
translate our prior distribution (2) into a posterior distri-
bution on $(\tilde{\mu}_A, \tilde{\mu}_B)$. The
sample likelihood function in this case is Gaussian, speci-
fied by the parameters:


(5) $M = \begin{pmatrix} m_A \\ m_B \end{pmatrix}$, where $m_a = \dfrac{1}{n_a} \sum_{i=1}^{n_a} x_{ai}$,

and $V = \begin{pmatrix} \sigma_A^2/n_A & 0 \\ 0 & \sigma_B^2/n_B \end{pmatrix}$

(with, again, $H = V^{-1}$).

The posterior distribution on $(\tilde{\mu}_A, \tilde{\mu}_B)$ is therefore also bi-
variate normal with posterior parameters (see [6]):

(6) precision matrix $H'' = H' + H$,
covariance matrix $V'' = (H'')^{-1}$,
and mean vector $M'' = (H'')^{-1}(H'M' + HM)$.


Given the sample outcome (sufficient statistics) $n_A$, $n_B$
and $M$, our posterior distribution may be sufficiently differ-
ent from our prior distribution to change our location deci-
sion, and in so doing will change our expected return. De-
noting the optimal choice under this posterior distribution
by $a_{**}$, and its expected present value by $\mu''_{**}$, we can define
the "conditional value of sample information":

(7) $\mathrm{CVSI}(n_A, n_B, M) = \mu''_{**} - \mu''_*$

We compare posterior means because our previous decision must
be reevaluated in light of what we now know about the pro-
cesses. If, of course, our decision remains the same as be-
fore, no value can be attached to the sample information
($a_{**} = a_*$ implies $\mu''_{**} = \mu''_*$).
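
The prior-to-posterior step and the conditional value of sample information can be traced numerically. The following is a minimal sketch under our own naming; it uses the usual normal updating rule (precisions add, and the posterior mean is the precision-weighted combination of prior mean and sample mean) written out for the two-process case without any matrix library.

```python
def inv2(m):
    """Inverse of a 2 x 2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matvec(m, v):
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]


def posterior(prior_mean, prior_cov, sample_mean, sigma2, n):
    """Posterior mean vector and covariance matrix for the two process means."""
    h_prior = inv2(prior_cov)                                   # prior precision
    h_sample = [[n[0] / sigma2[0], 0.0], [0.0, n[1] / sigma2[1]]]  # sample precision
    h_post = [[h_prior[i][j] + h_sample[i][j] for j in range(2)] for i in range(2)]
    v_post = inv2(h_post)
    rhs = [p + s for p, s in zip(matvec(h_prior, prior_mean), matvec(h_sample, sample_mean))]
    return matvec(v_post, rhs), v_post


def cvsi(prior_mean, post_mean):
    """Posterior value of the best process minus that of the prior choice."""
    prior_choice = max(range(2), key=lambda a: prior_mean[a])
    return max(post_mean) - post_mean[prior_choice]


m_post, v_post = posterior([10.0, 8.0], [[4.0, 0.0], [0.0, 4.0]],
                           [9.0, 13.0], [16.0, 16.0], [4, 4])
print(m_post, cvsi([10.0, 8.0], m_post))
```

In this made-up case the sample reverses the ranking of the two locations, so the sample information has positive conditional value; had the ranking been unchanged the function would return zero.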


What is the Expected Value (Prior to Sampling) of the 
Information Contained in a Given Sampling Plan? 


Before we take random samples of size $n_A$ and $n_B$ on the two
processes, the sample means and their implied posterior par-
ameters are both unknown, of course. We can, however, cal-
culate the CVSI for any conceivable sample outcome and take
its expectation over the sampling distribution of $M$. The
result will be an "expected value of the sampling plan con-
ditional on the values of $(\tilde{\mu}_A, \tilde{\mu}_B)$". Taking the expectation of this
function over the bivariate prior distribution (2) leads fin-
ally to the "expected value of sample information contained
in samples of size $n_A$ and $n_B$, prior to their observation":

(8a) $\mathrm{EVSI}(n_A, n_B) = E_{\tilde{\mu}} E_{M \mid \tilde{\mu}} \{\mu''_{**}(M) - \mu''_*(M)\}$.


Formula (8a) can be expressed in two alternative and con-
ceptually useful ways. First, since the prior expectation
of the posterior mean is equal to the prior mean, we can
write

(8b) $\mathrm{EVSI}(n_A, n_B) = E_{\tilde{\mu}} E_{M \mid \tilde{\mu}} \{\max(\mu''_A(M), \mu''_B(M))\} - \mu'_*$.

If we then carry through the expectation of $M$ given $\tilde{\mu}$ over
the prior distribution on $(\tilde{\mu}_A, \tilde{\mu}_B)$ we obtain the unconditional
distribution of $M$, which leads us directly to the "prior dis-
tribution of the posterior mean (vector)". (This distribu-
tion is referred to as the "preposterior" distribution in
[6, 7, and 8]; there is one school of thought that dubs it,
instead, the "preposterous" distribution!)

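By well-known theorems [6, ch. 23A] these values, denoted
by $(\tilde{\mu}''_A, \tilde{\mu}''_B)$, are also bivariate normal with

(9) expectation vector $E\begin{pmatrix} \tilde{\mu}''_A \\ \tilde{\mu}''_B \end{pmatrix} = \begin{pmatrix} \mu'_A \\ \mu'_B \end{pmatrix} = M'$

and covariance matrix $V(\tilde{\mu}'') = V' - V'' = V' H V''$,

where $(M', V')$ are defined in (2), $H$ in (5) and $V''$ in (6).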


Hence we have

(8c) $\mathrm{EVSI}(n_A, n_B) = E\{\max(\tilde{\mu}''_A, \tilde{\mu}''_B) - \tilde{\mu}''_*\}$.

Under this formulation, therefore, the calculation of
EVSI (cf. (3)) requires simply that we integrate the same loss
function as in Figure 2 over the distribution of $\tilde{\gamma} = (\tilde{\mu}''_B - \tilde{\mu}''_A)$
rather than $\tilde{\delta} = (\tilde{\mu}_B - \tilde{\mu}_A)$. Again this calculation is
obviated by tables of the Normal Linear Loss Function. Al-
ternatively, the values of $E\{\max(0, \tilde{t} - k)\}$ can be calculated
from the normal density and cumulative distribution functions: for
$\tilde{t} \sim N(0,1)$,

$L_N(k) = \int_k^\infty (t - k)\,p_N(t)\,dt = p_N(k) - k[1 - P_N(k)]$.
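
The closed form for the linear loss integral is easy to confirm numerically. The sketch below, with function names of our own choosing, compares it with a trapezoid-rule evaluation of the defining integral; the upper limit of 10 stands in for infinity and is an assumption that is harmless for moderate values of k.

```python
import math


def normal_pdf(t):
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def normal_cdf(t):
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def linear_loss_closed_form(k):
    """Density at k minus k times the right tail area beyond k."""
    return normal_pdf(k) - k * (1.0 - normal_cdf(k))


def linear_loss_numeric(k, upper=10.0, steps=20000):
    """Trapezoid rule for the integral of (t - k) * pdf(t) from k to upper."""
    h = (upper - k) / steps
    total = 0.0
    for i in range(steps + 1):
        t = k + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * (t - k) * normal_pdf(t)
    return total * h


for k in (-1.0, 0.0, 0.5, 2.0):
    print(k, linear_loss_closed_form(k), linear_loss_numeric(k))
```

The two columns agree to several decimal places, so the closed form can replace table lookups in any program that needs EVPI or EVSI for two processes.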


What is the Best Random Sampling Plan? 


At this point we must define a sampling cost function
$c(n_A, n_B)$ and compare it with $\mathrm{EVSI}(n_A, n_B)$ to obtain the "ex-
pected net gain from sampling":

(10) $\mathrm{ENGS}(n_A, n_B) = \mathrm{EVSI}(n_A, n_B) - c(n_A, n_B)$.

The optimal random sampling plan can then be defined as
the pair $(n_A^*, n_B^*)$ that

(11) maximizes $\mathrm{ENGS}(n_A, n_B)$ over $(n_A, n_B)$,

subject, possibly, to a budget constraint.

For general $c(\cdot,\cdot)$ functions this optimization can only be
achieved through a search procedure. (Search procedures
utilizing estimates of the partial derivatives $(\partial/\partial n_a)\,\mathrm{ENGS}(n)$
are probably quite efficient in the case of independent normal
processes; the presence of fixed costs or other discontinui-
ties in $c(n)$ complicates this search, of course.) When the
sampling cost is linear:

(12) $c(n_A, n_B) = \sum_a (K_a + c_a n_a) = K_A + c_A n_A + K_B + c_B n_B$

($n_a = 0$ implies $K_a = 0$,
where the $K_a$ represent the fixed costs of sampling, and are
usually large in comparison with the $c_a$) as it usually is in
simulation experiments, an analytic answer is obtainable for
this simple two-process decision. It is not, however, a
straightforward procedure and since this bivariate case is of
rather limited interest we will not go into it further here
(see [7, §5.10] for details), except to mention that when the
prior on $(\tilde{\mu}_A, \tilde{\mu}_B)$ is assumed to be "flat" ($H' = 0$), $n_A^*$ and $n_B^*$
should optimally be in the ratio

(13) $\dfrac{n_A^*}{n_B^*} = \dfrac{\sigma_A}{\sigma_B} \sqrt{\dfrac{c_B}{c_A}}$

This facilitates the search process, limiting it to a single
dimension.
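
A crude version of the search for the optimal plan can be sketched as a grid search. Everything in the block is illustrative: the function names, the numerical values and the grid are our own, and we assume independent prior distributions on the two means (a diagonal prior covariance matrix), so that the variance of the difference of the posterior means is the sum of the two reductions in variance.

```python
import math


def normal_pdf(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def normal_cdf(u):
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))


def evsi(n, prior_mean, prior_var, sigma2):
    """EVSI of the plan n = (n_A, n_B) under independent normal priors."""
    spread = 0.0
    for n_a, v_prior, s2 in zip(n, prior_var, sigma2):
        if n_a == 0:
            continue                                   # unsampled process: nothing learned
        v_post = 1.0 / (1.0 / v_prior + n_a / s2)
        spread += v_prior - v_post                     # reduction in variance
    if spread <= 0.0:
        return 0.0
    sd = math.sqrt(spread)
    k = abs(prior_mean[0] - prior_mean[1]) / sd
    return sd * (normal_pdf(k) - k * (1.0 - normal_cdf(k)))


def cost(n, fixed, per_trial):
    """Linear cost; the fixed cost is incurred only if the process is sampled."""
    return sum(K + c * n_a for n_a, K, c in zip(n, fixed, per_trial) if n_a > 0)


def engs(n, prior_mean, prior_var, sigma2, fixed, per_trial):
    return evsi(n, prior_mean, prior_var, sigma2) - cost(n, fixed, per_trial)


def best_plan(grid, **problem):
    plans = [(n_a, n_b) for n_a in grid for n_b in grid]
    return max(plans, key=lambda n: engs(n, **problem))


problem = dict(prior_mean=(100.0, 98.0), prior_var=(25.0, 25.0),
               sigma2=(400.0, 400.0), fixed=(0.1, 0.1), per_trial=(0.001, 0.001))
n_star = best_plan(range(0, 501, 10), **problem)
print(n_star, engs(n_star, **problem))
```

A grid search of this kind copes with the discontinuity that the fixed costs introduce at zero, at the price of many evaluations; a derivative-based search would be the natural refinement once the fixed costs are known to be incurred.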


Relaxing the Assumptions 





The easiest assumption to relax in the context of a simu- 
lation experiment is probably the normality assumption. The 
sample likelihood function (as regards information about the 
sample means) will be approximately Gaussian, with parameters 
indicated by (5), as long as the $\tilde{\epsilon}_{ai}$ are mutually independent
and $(n_A, n_B)$ are large enough to permit reliance on the central
limit effect. Large samples are generally the rule, of course,
since the cost per trial is usually quite small in comparison
with the opportunity loss of an incorrect decision. 

The problem of unknown variances is not as easy to dismiss.
Some results in this direction are given in [7, ch. 5B] under
the assumption of normal processes and a "Normal-Gamma" joint
prior distribution on $(\tilde{\mu}, \sigma^2)$, where the variance ratios $\sigma_a^2/\sigma^2$
are assumed known but their "level" $\sigma^2$ is unknown. Gordon
Kaufman at M.I.T. has investigated the more general case where
the entire covariance matrix $\Sigma = E(\tilde{\epsilon}\tilde{\epsilon}^{t})$ is unknown [3]; some
analysis is possible here due to the fact that $\Sigma$ has a
Wishart distribution.







A considerable amount of numerical and empirical investi- 
gation has led Arthur Schleifer at Harvard to suggest that in 
many cases "nearly optimal" sample size decisions will be 
made if the unknown variances are simply replaced by their 
estimates from a sample, and known-variance theory is applied.
(John Pratt provides a proof of this asymptotic behavior under
very general conditions, for a single process, in [5].) Schleifer's
investigations were primarily in connection with the "two-
action problem" in which only one process need be sampled,
however, and this approximation may not be so effective when
many processes are under consideration. Large enough sample
sizes will overcome most problems, of course; the difficulty 
is knowing what is "large enough." 

George Tiao at Wisconsin has obtained some very provoca- 
tive theoretical results [9] in this area, using the family 
of "power" distributions (of which the normal is a special 
case) to test departures from normality, and a prior distri- 
bution on $\sigma^2$ to test departures from known variance. Clearly
more empirical investigation of the robustness of this pro- 
cedure is called for, however. 

The assumption that the $\tilde{\epsilon}_{ai}$ are mutually independent ap-
pears to be almost non-relaxable, unless we wish to plunge
into a philosophical morass. On the surface, interdependence 
doesn't appear to be that much of a problem. The prior-post- 
erior (6) and preposterior (9) relationships remain valid 
when V is not diagonal, and Kaufman's results [3] provide the 
necessary theoretical underpinning for the more general prob- 
lem of unknown covariances. The search procedure in ENGS is 
complicated because it can no longer be directed along the 
axes (since each sample observation provides information about 
all the processes, it is inefficient to just change one $n_a$ at
a time), and there is some reason to believe that the ENGS 
surface is rather "bumpy". But the primary complication is 


due to the possibility of combinatorial sampling plans. 




Assuming that the $\tilde{\epsilon}_{ai}$ are interdependent means that they
are interdependent at a given replication of the experiment
(that is, in a given simulation run $E(\tilde{\epsilon}_{Ai}\tilde{\epsilon}_{Bi}) \neq 0$). However,
they are not interdependent across runs, if we assume truly
random event generation. Hence the "optimal" sampling plan
might be "test both locations on 400 trials today, and then
test location B an additional 400 trials tomorrow", which is
a quite different plan than "test location A 400 times today
and location B 800 times tomorrow" even though in both cases
$n_A = 400$ and $n_B = 800$!

This is unfortunate, because there are many situations in 
which interdependence cannot be assumed away. Later we will 
look at a type of sampling plan in which it is possible to 
deal with dependent error terms. 

The assumptions of a single criterion and linear utility 
are not particularly constraining. The criterion could easily 
incorporate some measure of the variability of each location's 
present values (of the form $x = [E(v) - k\,V(v)]$, for example);
similarly the company could base their decision on the expecta-
tion of an assessed utility function $[x = E\,u(v)]$.


Choosing the Best of r Processes (r > 2)


Expanding the choice to include additional processes does 
not complicate the problem conceptually. EVPI, EVSI, and 
ENGS are defined as before, the sample statistics (5) are 
calculated (expanded, of course, to include the new processes), 
and the same prior-posterior and pre-posterior relations hold. 
What does become a problem is the computational burden imposed 
by the necessity of integrating over (r-1)-space in calculat-
ing EVPI and EVSI, and searching in r-space for the optimiz-
ing sample size vector n*. Unfortunately, no exact numerical 
techniques now exist for performing the integration (8c) in 
more than two dimensions (i.e., three processes), although 
the availability of good approximations to bivariate normal 




"wedge probabilities" suggests that it is probably possible 
to extend the analysis to four processes without too much 
difficulty. 


EVPI

The estimation of EVPI is not too onerous. If a pilot
sample of size $n_0$ has been conducted the EVPI can be approxi-
mated by

(14) $\mathrm{EVPI} \approx \dfrac{1}{n_0} \sum_{i=1}^{n_0} z_i - \mu'_*$

where $z_i = \max(x_{Ai}, x_{Bi}, \ldots, x_{ri})$

The only fly in the ointment here is that the variance of the
$z_i$ is likely to be substantial, so that $n_0$ might have to be
quite large before the sampling error is brought within man-
ageable bounds. (See, for example, Gumbel's discussion of
this characteristic in [2].) Rough upper and lower bounds on
EVPI can be calculated quite simply and are described in [6,
ch. 23]. How good they are for large values of r has not
been investigated; since, however, the main purpose in cal-
culating the EVPI is to establish a frame of reference for
evaluating alternative sampling plans (and, possibly, to
bound the search for n*), great accuracy is probably not es-
sential.
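
A pilot-sample approximation of this kind might be sketched as follows. The function name and data layout are hypothetical, and the reference value subtracted from the average maximum is taken here to be the largest pilot-sample mean, which is our own assumption about how the current best choice would be valued after the pilot sample.

```python
import math
import statistics


def evpi_from_pilot(trials):
    """trials[i][a] is the value observed for process a on pilot trial i."""
    z = [max(row) for row in trials]                      # best process on each trial
    n_processes = len(trials[0])
    means = [statistics.fmean([row[a] for row in trials]) for a in range(n_processes)]
    estimate = statistics.fmean(z) - max(means)
    # Standard error of the average of the z values only; it ignores
    # the sampling error in the largest pilot mean.
    std_error = statistics.stdev(z) / math.sqrt(len(z))
    return estimate, std_error
```

Reporting the standard error alongside the estimate makes the warning about the variability of the maxima concrete: if it is of the same order as the estimate itself, the pilot sample is too small to bound the search usefully.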


EVSI 


No simple way of approximating EVSI is available in gen- 
eral, unfortunately. In [6] Pratt, Raiffa and Schlaifer con- 
cluded that the only feasible approach was to use a Monte 
Carlo analysis (separate from the simulation experiment that 
provides the context for this problem) to estimate the in-
tegral $E \max\{\tilde{\mu}''_A, \ldots, \tilde{\mu}''_r\}$. An experimental computer
program has been written at Harvard which searches for the
optimal n* values using a Monte Carlo approximation to EVSI.







This method is somewhat suspect here, however, for the 
same reason as before: the high degree of variability occa-
sioned by the "max" operation. In a recently completed Ph.D.
thesis at M.I.T. [4], C.Y. Lin offered some disturbing evidence 
to support this suspicion. For 80 "representative" three-pro- 
cess test problems he compared the optimal sample sizes n* 
with those obtained via estimates of (8c) based on 400 Monte 
Carlo trials and found substantial discrepancies. Moreover, 
the EVSI associated with these estimates of n* averaged about 
5% greater than EVSI (n*), the maximum percentage error being 
23.5%. (Lin also tested a more efficient "fixed fractile" 
sampling procedure that cut both the average error and the max- 
imum error about in half.) For more than three processes the 
Monte Carlo approximation is undoubtedly even less efficient, 
but there are no computational routines available now to check 
this. In any case, for reasonably large sized problems (r = 
9, say) the Monte Carlo approach appears to be hopelessly in- 
adequate; not only does the number of points requiring eval- 
uation explode geometrically, so does the number of Monte 
Carlo trials required to maintain efficiency at a given level. 
A better procedure may be to carve out groups of 3 processes 
at a time and use a highly efficient approximate analytical 
procedure that Lin developed to search for the optimum in
some iterative fashion.
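
The variability of the Monte Carlo approach can be seen in a small Python sketch. It is not the Harvard program or Lin's procedure; it assumes independent normal priors and known sampling variances, so that the posterior mean of process i is, before sampling, normally distributed about the prior mean with variance equal to the prior variance less the posterior variance. All function and parameter names are hypothetical.

```python
import math
import random


def preposterior_sd(prior_var, sampling_var, n):
    """Std. deviation of the (as yet unknown) posterior mean of one process."""
    if n == 0:
        return 0.0
    posterior_var = 1.0 / (1.0 / prior_var + n / sampling_var)
    return math.sqrt(prior_var - posterior_var)


def evsi_monte_carlo(prior_means, prior_vars, sampling_vars, n_alloc,
                     trials, seed=0):
    """Monte Carlo estimate of EVSI for the sample sizes in n_alloc."""
    rng = random.Random(seed)
    sds = [preposterior_sd(v, s, n)
           for v, s, n in zip(prior_vars, sampling_vars, n_alloc)]
    best_prior = max(prior_means)
    total = 0.0
    for _ in range(trials):
        # draw one set of posterior means and record the gain of the "max"
        best_post = max(rng.gauss(m, sd) for m, sd in zip(prior_means, sds))
        total += best_post - best_prior
    return total / trials


if __name__ == "__main__":
    # the same sampling plan evaluated with different random number seeds
    for seed in range(5):
        print(evsi_monte_carlo([10.0, 10.5, 11.0], [1.0, 1.0, 1.0],
                               [4.0, 4.0, 4.0], [5, 5, 5], 400, seed=seed))
```

Evaluating one plan with 400 trials under several seeds, as in the loop above, shows the trial-to-trial scatter against which a search for n* would have to work.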


Approximations to EVSI in Special Cases 


There do exist two rather plausible situations where the 
computation of EVSI does not appear too infeasible. One is 
when the process means are assumed independent a priori (V' 
diagonal) which will, of course, hold if a flat prior is
assumed. The joint distribution of the posterior means then
splits up into a product of univariate distributions:

(15)  $P(\tilde{\mu}_1'', \ldots, \tilde{\mu}_r'') = \prod_{i=1}^{r} P_i(\tilde{\mu}_i'')$.



What we want is the expectation of the maximum of this joint
distribution. Letting $\eta$ denote this maximum, with
distribution function

(16)  $P(\eta) = \prod_{i=1}^{r} P_i(\tilde{\mu}_i'' \le \eta)$,

we have, by (8c),

(17)  $\mathrm{EVSI} = \int_{\mu'_{\max}}^{\infty} (\eta - \mu'_{\max}) \, dP(\eta) = \int_{\mu'_{\max}}^{\infty} [1 - P(\eta)] \, d\eta$.


We have thus reduced an (r-1)-fold integral to a single integral
whose computation is relatively manageable. (Richard Meyer 
and John Pratt have both mentioned this simplification in 
private conversations; undoubtedly others are aware of it.) 
A search over r dimensions is still required, of course. 
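
The single integral is easy to evaluate numerically, as the following Python sketch suggests. It assumes normal preposterior distributions with known means and standard deviations, and it computes the expected maximum from the product of the univariate distribution functions by the trapezoidal rule. The integration range, the step count, and the function names are assumptions made for illustration, and the form of the integral used here is a standard identity that need not coincide with the exact form of (17).

```python
import math


def normal_cdf(x, mean, sd):
    """Normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def expected_max_independent(means, sds, steps=20000, width=10.0):
    """E[max] of independent normals from one integral of 1 - P(eta).

    Uses E[eta] = lo + integral over [lo, hi] of (1 - P(eta)), where lo
    and hi lie `width` standard deviations beyond every mean, so that
    P(lo) is essentially 0 and P(hi) essentially 1. All sds must be > 0.
    """
    lo = min(m - width * s for m, s in zip(means, sds))
    hi = max(m + width * s for m, s in zip(means, sds))
    h = (hi - lo) / steps

    def survival(eta):
        p = 1.0
        for m, s in zip(means, sds):
            p *= normal_cdf(eta, m, s)  # product of univariate cdfs
        return 1.0 - p

    total = 0.5 * (survival(lo) + survival(hi))
    for k in range(1, steps):
        total += survival(lo + k * h)
    return lo + h * total


def evsi_independent(prior_means, preposterior_sds):
    """EVSI as the expected maximum less the largest prior mean."""
    return expected_max_independent(prior_means, preposterior_sds) - max(prior_means)
```

For two processes with equal prior means and unit preposterior standard deviations the routine returns approximately 0.564, which agrees with the known value $1/\sqrt{\pi}$ for the expected maximum of two independent standard normal variables.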

The second case in which the computation of an optimal 
sampling plan is rather straightforward is fortunately one 
that is rather common in simulation experiments: when
"comparison sampling" is used, each decision alternative is
tested against the same random event. (Most statisticians
would refer to this as a "complete block design," I believe.) 
In the context of our example, this would imply that each 
Monte Carlo trial would consist of a single randomly selected 
"environment" (containing both a demand history and a wage rate 
history) in the context of which each location would be eval- 
uated. Not only is this a more efficient method of sampling 
in many situations, but for our purposes it simplifies the 
determination of the optimal sampling plan, which reduces to 
the search for a single number n*. 
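
The gain from comparison sampling can be illustrated with a short Python sketch. Two hypothetical locations are evaluated either against the same randomly drawn "environment" on each trial or against separately drawn environments; the response functions, their coefficients, and the noise levels are invented for the illustration.

```python
import random
import statistics


def simulate_difference(n, common, seed=0):
    """Mean and variance of (location B - location A) over n trials.

    common=True evaluates both locations in the same random environment;
    common=False draws a separate environment for each location.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n):
        env_a = rng.gauss(0.0, 1.0)  # e.g. a demand / wage-rate history
        env_b = env_a if common else rng.gauss(0.0, 1.0)
        # hypothetical present values: strong dependence on the
        # environment plus a small location-specific disturbance
        value_a = 10.0 + 3.0 * env_a + rng.gauss(0.0, 0.2)
        value_b = 10.5 + 3.0 * env_b + rng.gauss(0.0, 0.2)
        diffs.append(value_b - value_a)
    return statistics.fmean(diffs), statistics.variance(diffs)


if __name__ == "__main__":
    print("comparison sampling :", simulate_difference(2000, common=True))
    print("independent sampling:", simulate_difference(2000, common=False))
```

Because the environment term cancels in the difference, the variance of the comparison is far smaller under the common environment, which is why a single sample size n* suffices for all alternatives.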

In this case the $\tilde{\epsilon}_i$ would almost certainly be
interdependent (otherwise comparison sampling wouldn't be attractive)
but there would be none of the confusion we mentioned earlier
in connection with the possibility of combinatorial sampling
plans. In implementing this approach we would probably have
to use the known-variance theory in conjunction with point
estimates of the elements in $\Sigma$.


The Sensitivity of EVSI to n 


Although we have stressed the computational difficulties 
associated with searching for the optimal n values, it is only 
fair to add some comments on what is known about the cost (op- 
portunity loss) associated with errors in these choices. The 
inferences that can be drawn from the evidence available in 
this regard are somewhat suspect, since most of this evidence 
has been culled from situations where only two or three pro- 
cesses were involved. As we mentioned earlier, Lin [4] looked 
at 80 "representative" 3-process problems and found that
estimates of n* obtained through the use of an inefficient Monte
Carlo analysis led to a 5% loss in EVSI, on the average, as 
compared with the exact n* values. Three-quarters of the los- 
ses were less than 10%. This suggests that EVSI is generally 
fairly flat near its optimum. 

Arthur Schleifer, in an unpublished investigation of two- 
stage sampling plans [8], offers a considerable amount of 
evidence to substantiate this conjecture. In particular, he 
showed that the EVSI under an optimal two-stage sampling plan 
was generally not too much greater than that obtained through 
a "double one-stage" plan: choose an optimal one-stage plan 
(as if there weren't going to be a second stage); then, using 
its outcome to update your prior, calculate the optimal size 
for an additional sample. Admittedly, most of his results are 
in connection with a two-action problem with linear economics 
(which is equivalent, under the proper definition of terms, to 
a two-process problem), so their applicability to large decision 
sets is questionable. We conjecture that the "smoothing effect 
of repeated averaging" will probably cause EVSI to be even less 
sensitive to departures from optimality as the number of
processes is increased. Of course, the chance of large errors
probably also increases.


How Valid is the Model? 


One may well argue that if the preceding analysis had been
used on our example problem, the "optimal" sample sizes that
would have been suggested might have been unconscionably large,
since the potential cost associated with making an incorrect
plant location decision is probably orders of magnitude larger
than the cost of running the 7094 (say) another hour. Although
a number of empirical studies have indicated that EVSI gener- 
ally levels off quite rapidly as n increases (even for very 
large relative cost ratios), this does not rule out the pos- 
sibility of a "nonsense solution". 

If the sample sizes suggested by our procedure appear to 
be too large to the experimenter, the fault probably lies 
in one of two areas. First, his own intuition may have failed 
him (few people have really good intuition when balancing the 
costs of a sample against the information it provides). Second, 
and more likely, the prior-posterior theory we have developed 
may have overstated the increase in precision provided by a 
given sample size because it neglected the systematic measure- 
ment bias associated with the use of the simulation model 
itself. Model bias cannot be reduced by any sampling tech- 
nique, of course. It can, however, be explicitly taken into 
account in our analysis provided we are willing to make a sub- 
jective assessment of its potential magnitude. 

Returning to (1), let us assume that the mean value of the
simulated process ($\tilde{\mu}_i$) is composed of two elements:

(18)  $\tilde{\mu}_i = \tilde{\theta}_i + \tilde{\beta}_i$

where $\tilde{\theta}_i$ is the (unknown) mean value of the real
process, in this case the "true" discounted
present value associated with a warehouse
in location i;

and $\tilde{\beta}_i$ is the measurement bias (independent of $\tilde{\theta}_i$)
introduced by the simulation model.


We now require that the experimenter be willing to describe
his uncertainty about the values the $\tilde{\beta}_i$ might have in the form
of a multivariate normal distribution with expectation 0 and
variance $V_\beta$. (A treatment of the more general case where
$(\tilde{\mu}, \tilde{\beta})$ have a joint multivariate distribution and
$E\{\tilde{\beta}\} \neq 0$ is contained in [6], chapter 23B.) The marginal
distribution on $\tilde{\theta}$ is therefore also multivariate normal with

(19)  expectation vector $E\{\tilde{\theta}\} = M'$, as before,
      and covariance matrix $V\{\tilde{\theta}\} = V' - V_\beta \equiv V_\theta'$
      (implying $V' = V_\theta' + V_\beta$), where $V'$ is defined in (2).

It is this distribution which would now be employed in cal-
culating EVPI.

In order to analyze the impact of sampling (on the $\tilde{\mu}$, not
the $\tilde{\theta}$) we must deal with the joint distribution of $(\tilde{\theta}, \tilde{\mu})$,
which is multivariate normal with

(20)  expectation vector $\begin{bmatrix} M' \\ M' \end{bmatrix}$,

      covariance matrix $\begin{bmatrix} V_\theta' & V_\theta' \\ V_\theta' & V_\theta' + V_\beta \end{bmatrix} \equiv W'$,

      and precision matrix $G' = (W')^{-1} = \begin{bmatrix} G_{11}' & G_{12}' \\ G_{21}' & G_{22}' \end{bmatrix}$.

After sampling we utilize the statistic H, as defined in (4),
to obtain the

      precision matrix $G'' = \begin{bmatrix} G_{11}' & G_{12}' \\ G_{21}' & G_{22}' + H \end{bmatrix}$

      and covariance matrix $W'' = (G'')^{-1}$.

The expectation vector of this posterior distribution, which is not
needed in the calculation of EVSI, is given by


(a8) 



Hence the marginal posterior distribution on $\tilde{\theta}$ is normal with

(22)  covariance matrix $V\{\tilde{\theta}\} = [G_{11}' - G_{12}' (G_{22}' + H)^{-1} G_{21}']^{-1} \equiv V_\theta''$.

The preposterior distribution of $\tilde{\theta}''$ would then have

(23)  expectation vector $E\{\tilde{\theta}''\} = M'$, as before, and
      covariance matrix $V\{\tilde{\theta}''\} = V_\theta' - V_\theta''$,

where $V_\theta'$ and $V_\theta''$ are defined in (19) and (22) respectively.

Introducing $\tilde{\beta}$ into the discussion thus increases the implied
uncertainty about $\tilde{\theta}$, bounds its posterior variance away from
0, and reduces the economic value of sampling. The net effect
is to increase the EVPI and decrease n*.
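
The bounding of the posterior variance can be checked in the scalar case with a few lines of Python. The sketch assumes a single process, a bias independent of the real-process mean (so that the covariance between the real and the simulated means equals the prior variance of the real mean), and a sampling precision h that grows with the sample size. The function names are hypothetical.

```python
def posterior_var_theta(v_theta, v_beta, h):
    """Posterior variance of the real-process mean, scalar case.

    The joint prior covariance of (theta, mu) is taken to be
        [[v_theta, v_theta], [v_theta, v_theta + v_beta]],
    and h is the precision contributed by the sample on mu.
    """
    det = v_theta * v_beta  # determinant of the 2 x 2 covariance matrix
    g11 = (v_theta + v_beta) / det
    g12 = -v_theta / det
    g22 = v_theta / det
    # upper-left element of the inverse of the posterior precision matrix
    return 1.0 / (g11 - g12 * g12 / (g22 + h))


def variance_floor(v_theta, v_beta):
    """Limit of the posterior variance as the sample size grows."""
    return v_theta * v_beta / (v_theta + v_beta)


if __name__ == "__main__":
    for h in (0.0, 1.0, 10.0, 1000.0):
        print(h, posterior_var_theta(4.0, 1.0, h), variance_floor(4.0, 1.0))
```

With no sample (h = 0) the function returns the prior variance; as h grows without limit it approaches the floor v_theta * v_beta / (v_theta + v_beta), which is positive whenever the bias variance is positive.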


Conclusion 


The classical problem of choosing the best of several pro- 
cesses, given a simple random sampling alternative, is a 
central one in simulation analyses. In many situations there 
are only two or three real alternative courses of action to 
begin with (to merge or not to merge, to bring out a product 
or not, to build a plant here or there), if only because of
the amount of detailed information that it is generally nec- 
essary to obtain about the alternatives before one can model 
them properly - so that the time constraints on the researcher 
automatically lead him to eliminate many alternatives on a 
priori grounds. Even when a multiplicity of alternatives makes 
a stochastic search technique more attractive in the initial 
phases of analysis, such a technique generally acts primarily 
as a rough screening device to direct the researcher's atten- 
tion to those few alternatives that merit detailed appraisal. 
In most of the important (monetarily) simulation studies we
have witnessed there has come a point when the choice boils
down to a relatively few candidates, and the nagging question
"have I sampled long enough?" arises.

We have tried to sketch out the essentials of a theoretical 
approach of great power and elegance that appears to offer 
considerable insight into this question. Although the theory 
is built upon some rather shaky assumptions about the real 
world, the nature of most simulation experiments (in particu- 
lar the large sample sizes they foster) immunizes them to some 
extent against the breakdown of these assumptions. In most 
cases it seems to be fair to say that:


1. the normality assumption can generally be supplanted by 
the central limit effect; 


2. unknown variances can often be replaced by point esti- 
mates from pilot samples and known variance theory used 
(when the primary concern is the comparison of two or
more mean values); 


3. the search for an "optimal" sampling plan can often be 
simplified by the choice of a convenient subfamily of 
plans (such as in the case of comparison sampling), 
since there is some evidence to support the contention 
that the EVSI function is generally rather insensitive
to substantial changes in n near its optimum; 


4. picking an "intelligent" sample size for a pilot sample
and then optimizing over the second stage is probably 
quite efficient, in terms of EVSI, as compared with an 
optimal two-stage sampling plan (and almost certainly 
better than a one-stage plan); 


5. the possibility of systematic bias in the simulation 
model, while complicating somewhat the computations, 
can be handled within the same conceptual framework. 


Moreover, the opportunity for obtaining the type of informa- 
tion necessary to assess the prior distribution (2) (through 
validation runs, pilot samples, stochastic search filters, 
etc.) probably makes the applicability of the Bayesian approach
more palatable in this context than in most other experimental
situations.

Bibliography

1. Dunnett, C.W. "On Selecting the Best of k Normal Population
   Means," Journal of the Royal Statistical Society, B, (1960).

2. Gumbel, E.J. Statistics of Extremes. New York: Columbia
   University Press, 1958.

3. Kaufman, G.M. "Some Bayesian Moment Formulae." Discussion
   Paper No. 6710, Center for Operations Research & Econometrics,
   Université Catholique de Louvain, 1967.

4. Lin, C.Y. "A Decision Theoretic Approach to the Design and
   Analysis of Industrial Experiments." Unpublished Ph.D.
   dissertation, M.I.T., 1968.

5. Pratt, J.W. "Bayesian Interpretation of Standard Inference
   Statements," Journal of the Royal Statistical Society, B, (1965).

6. Pratt, J.W., Raiffa, H., and Schlaifer, R. Introduction to
   Statistical Decision Theory. New York: McGraw-Hill Book
   Co., 1965.

7. Raiffa, H., and Schlaifer, R. Applied Statistical Decision
   Theory. Division of Research, Graduate School of Business
   Administration, Harvard University, 1961.

8. Schleifer, A. "Two-Stage Normal Sampling in Two-Action
   Problems with Linear Economics." Unpublished manuscript,
   Graduate School of Business Administration, Harvard
   University, 1968.

9. Tiao, G.C. "Bayesian Comparison of Means of a Mixed Model
   with Application to Regression Analysis," Technical Report
   #49, Department of Statistics, University of Wisconsin, 1965.


Simulation Languages 


Howard S. Krasnow, IBM 


Introduction 


Simulation languages fulfill a variety of needs. They make 
programming easier; they provide conceptual assistance for the 
development of models; and they support the carrying out of
experiments. It can be observed that our general understanding, 
and hence our ability to provide useful facilities in simula- 
tion languages, decreases from the first of these needs to the 
last. 

Much is known about programming requirements: the need to 
shield the user from lower-level languages and structures, the 
need for good and complete diagnostics, the ability to execute 
with imperfect programs, the checking for validity of input 
data, freedom from concern with storage allocation, and so on. 

Simulation languages meet these needs, but their major con- 
tribution lies in the second area: providing the user with a 
world view, that is, a way of thinking about and describing 
the real world in such a manner that the simulation system is 
capable of accepting the description. This world view varies 
among languages, reflecting differences in focus and implying 
differences in the class of problems for which the language is 
best suited. Some of the important concepts which have been 
implemented include particle flow in queuing networks, dis- 
crete event sequencing, conditional activities, interactive 
processes, and block recursion. Simulation languages which 
embody these concepts are, respectively, GPSS [6], SIMSCRIPT 
[9], CSL [1], SIMULA [2], and SIMULATE [7]. 

The third need has received somewhat less attention in
simulation languages, perhaps because of the difficulty in
defining reasonably general experimental approaches. In some cases,
a particular mode of experimentation is partially implicit in
the language, and in most cases some basic facilities are
provided. However, the experimental nature of simulation has
served primarily as a motivation for the use of simulation
languages because of the need for quick and frequent changes
in the models being run. This characteristic of simulation
has done much to stimulate the development of simulation
languages.


The Structure of Simulation Experiments 


Despite the fact that experimentation is at the heart of 
simulation, relatively little attention has been given to the 
subject in the literature. Tocher [12] characterized simula- 
tion as a sampling experiment. This is a useful, although 
somewhat narrow, orientation. Others [4,10] have addressed 
specific problems in certain types of simulation experiments. 
Naylor [11] has called attention to the broader nature of the 
problem, and this symposium provides prime evidence of grow- 
ing interest in the subject. In order to discuss experimenta- 
tion from the point of view of simulation languages, it will 
be useful to develop a general characterization of simulation
experiments.


Context 


The context of a simulation experiment is indicated by 
Figure 1. A simulation study focuses upon some real world 
problem. Ideally, this problem is clearly identified and 
stated prior to development of the simulation model and exper- 
iment. For example, what will be the effect upon production 
of a specified change in delivery policy? In practice, there 
are several possibilities other than developing a specific 
model and experiment to shed light on a single problem. The
best situation is one in which a fixed and closed set of
problems is defined in advance to be addressed by the model and
experiment. The result can be a somewhat generalized or multi- 
purpose model coupled with one or a set of experiments which 
are capable of answering the proposed questions. Unfortunately, 
the real world is often not understood sufficiently well to 
make it possible to pose all of the questions in advance. In 
this case, a set of questions or problems can be stated; how- 
ever, there is a need for providing a more or less open-ended 
structure to the model so that additional questions can be 
framed as the result of early experimental results, with the 
hope that the model may still prove relevant to these new ques- 
tions. The weakest situation, of course, is one in which the 
user has only a vague idea of the questions to be addressed. 
Although this may seem absurd, it is nevertheless true that 
some users proceed in this manner, often ending up with elab- 
orate and extensively detailed models which may prove of no 
value to any experimental purpose. The essential point is that 
the model exists solely to serve as a plausible and rational 
vehicle to relate the factors of an experiment to its responses. 
All else is costly window dressing which can only serve to
blur the important relationships.

No model should be built without an explicit experimental 
concept in mind, and conversely it is probably not feasible to 
fully define the experimental approach without some notion of 
the model structure upon which the experiment is to be con- 
ducted. Given a model and an experiment, the execution of the 
simulation study requires an iterative process of establishing 
an initial state for the model, providing input to the model, 
observing its behavior, and analyzing the results. The pos- 
sibility of redesign of an experiment, as well as the model 
itself, to accommodate new facts learned during the
experimentation, is implicit.

Within this context, an experiment may be characterized by
the following elements: its purpose, which reflects the manner
in which the experimenter intends to relate to the real world
problem he is addressing; the factors, i.e., those variables of
the model which will be controlled in order to observe some
response which constitutes a measurement of interest; and the
experimental process, which determines the manner in which the
factor levels will be varied so as to address the purpose of
the experiment as completely as possible for as low a cost to
the user as is possible. In this section we shall examine
each of these elements in somewhat greater detail. In the
following section we shall consider what facilities are
provided by each of several simulation languages in order to
support these different elements.


Experimental Purpose 


Simulation experiments are conducted for a wide variety of 
purposes. Figure 2 suggests some of the purposes which are 
common, although this is not an exhaustive list. There are, 
nevertheless, some important implications that follow from 
the specific purpose of an experiment. 


Evaluating a system design may imply determining how good
the design is in some absolute sense. In designing a
teleprocessing system, for example, the question is often asked:
"What will be the system response to certain types of inquiries
under specified load conditions?" The desire of the user is
to obtain an absolute measure from the model which can be used
as a good predictor of the real system response. This imposes
a heavy burden of accuracy upon the model.


Comparative results, as when different reordering policies 
are being compared under a particular inventory system, may 
well be valid in a relative sense, even though the absolute 
magnitudes of the responses are widely different from those
that would be encountered in the real world.


Verification suggests a desire to determine that the model 
performs in some sense as it is intended to perform. If the 
output of a model is expected to conform to some specified
probability distribution, then the purpose of an experiment
might be simply to verify that the output of the model does,
in fact, conform to the expected distribution. Verification 
is a purpose that is often pursued prior to using a model for 
decision purposes [5]. 


Prediction may again involve the need to rely upon the 
absolute magnitude of experimental responses. The emphasis 
is upon estimating performance under some projected set of 
conditions.


Sensitivity analysis is one of the most attractive and
legitimate purposes of simulation experiments. In dealing
with highly complex situations, the critical concern may well
be with determining which of many factors are the most
significant determinants of overall performance.


Functional relations. There is often a need to go beyond 
an analysis of sensitivity to the detailed determination of a 
functional relationship between one or more significant factors 
and the response. In corporate models, for example, the deter- 
mination of the nature of the relationship between profit and 
various forms of investment may provide more useful information 
for decision purposes than an estimate of absolute profit
levels. If a maximum exists, the manner in which it is approached
can be highly significant.


Optimization is a purpose which is probably more widely 
desired than it is pursued. The determination of exactly 
what combination of factor levels provides the best overall 
response poses difficult problems for the experimenter. The 
use of heuristic optimization methods has been widely proposed, 
but the technical problems appear to be formidable and the 
costs may well outweigh the potential benefits [8]. 


Validation constitutes an attempt to relate the performance
of the model to the known performance of the real system. This
requires that the performance of the real system be known and,
furthermore, that it is possible to establish the conditions
under which that performance was attained so that the model may
be compared with it. For example, validating a production
model could well require reconstructing the circumstances under
which historical production was obtained and providing the same
sequence of orders and structural conditions that existed at
the time the real performance was observed. This is again a
case in which the absolute values of the response variables are 
to be compared with the real world. There remains the question 
of whether or not the model thus validated will continue to pro- 
duce legitimate results under new experimental conditions for 
which no real world data exists. Nevertheless, a well-validated 
model is far more likely to generate confidence in decision- 
makers than one for which validation was not or could not be 
achieved. 


Factors 


It is characteristic of simulation experiments that virtually 
anything in a model can be used as a factor in an experiment. 
It is desirable, of course, that factor levels be easily change- 
able either during or between simulation runs. The frequency of 
change is determined by the design of the experiment. While all 
factors are model variables, it will be useful to characterize 
factors in the following manner: 


Model structure. The structure of any simulation model con- 
sists of the real world components that are modeled and the 
interrelated description of the behavior of these components 
over time. For example, in a traffic model the components might 
be vehicles and highway intersections, and the behavior descrip- 
tion would characterize the movements of vehicles through inter- 
sections. In economic models, the components are often highly 
aggregated variables such as consumption, production, and cap- 
ital goods; and the description of behavior may specify the re- 
lationships which exist between these components based upon 
theory or empirical data. Quite often, the primary experimental
factors are structural elements of the model. The comparison
of two totally dissimilar storage devices in a given computer
system might well require separate model structures to serve
as experimental factors. Comparisons of operating decision
rules, or planning policies, might require different behavioral
descriptions to serve as the factors of an experiment. Good
model design will seek to minimize the difficulty of
establishing different factor levels during an experiment. It is
usually assumed that factor levels will be changed only between
simulation runs where the factor levels cannot be changed
automatically during a run in accordance with a predetermined
experimental plan.


The initial state of the model. The initial model state 
can be viewed as the values of all of the variables in the 
model at the point at which response measurement begins. This 
is often more complex than setting variables to predetermined 
values. For example, in a production model it could involve 
establishing the numbers and types of equipment to be used dur- 
ing this simulation run together with the queues of orders 
located at various parts of the production facility. Further- 
more, the determination of whether or not a valid initial state 
has been established may require considerable simulation 
coupled with the application of various testing procedures. In 
a model of a communications network, one may wish an initial 
state which reflects steady state conditions. If one starts 
with an empty system and imposes a steady state load, e.g., 
incoming telephone calls, it could take a considerable length 
of simulation time before all segments of the network settle
down to a reasonably steady state.
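
The empty-system start can be illustrated with a small Python sketch of a single-server queue. It uses the standard recursion for successive waiting times and simply discards an initial block of observations before measuring the response. Truncation of this kind is only one common way of approximating a steady state initial condition; the queue, its rates, and the function names are assumptions made for the example.

```python
import random
import statistics


def waiting_times(n_customers, arrival_rate, service_rate, seed=0):
    """Waiting times in a single-server queue that starts empty.

    Each wait follows from the previous one:
        next_wait = max(0, wait + service_time - interarrival_gap).
    """
    rng = random.Random(seed)
    waits = [0.0]  # the first customer finds the system empty
    for _ in range(n_customers - 1):
        service = rng.expovariate(service_rate)
        gap = rng.expovariate(arrival_rate)
        waits.append(max(0.0, waits[-1] + service - gap))
    return waits


def mean_after_warmup(observations, warmup):
    """Average response after discarding the first `warmup` observations."""
    return statistics.fmean(observations[warmup:])


if __name__ == "__main__":
    w = waiting_times(20000, arrival_rate=0.9, service_rate=1.0)
    print("all customers    :", mean_after_warmup(w, 0))
    print("after 2000 warmup:", mean_after_warmup(w, 2000))
```

Choosing the warm-up length is itself part of the experimental design; the figure of 2000 customers above is arbitrary.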


Model environment. It is not necessary in a simulation 
model to distinguish between the model of the real world sys- 
tem and the model of the environment of that real world system. 
Nevertheless, most simulations have these two major elements,
although the manner in which they are modeled and the
difficulty with which they are modeled varies widely. In some
models the environment consists of exogenous inputs to the
model; that is, the input is dependent only upon simulation
time and not upon the state of the model at that time. The
use of exogenous inputs requires considerable knowledge of the
state of the model over simulated time. It also provides a
means for constraining the behavior of the model over time.
For example, in an economic model concerned with evaluating
fiscal policy, consumer price indices might be provided as
exogenous inputs. This time series could constitute an
experimental factor and a particular time series would represent a
particular level of the experimental factor.

In many experiments the environment is represented by a 
stochastic input obtained by sampling from specified distri- 
butions. Either the parameters of these distributions or the 
type of distribution could constitute experimental factors. 
The interarrival time between messages arriving at a communi- 
cations network could be determined by successive sampling 
from a specified distribution. In this case the experimental 
factor could be the mean or variance of the distribution (if 
it were a normal distribution), or the type of distribution, 
for example, a normal as compared to an exponential. Whether 
the environment is represented by exogenous or stochastic
inputs, it can be based upon historical data or upon arbitrary
data dependent upon the purpose of the experiment and the
particular design.
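
The idea that the type of distribution and its parameters are themselves factor levels can be sketched in Python as follows. The model sees only a function that returns the next interarrival time, so the environment can be changed between runs without touching the model. The names, the list of factor levels, and the truncation of the normal distribution at zero are assumptions made for the illustration.

```python
import random


def make_interarrival_sampler(kind, mean, sd=None, seed=0):
    """Return a function that yields the next interarrival time."""
    rng = random.Random(seed)
    if kind == "exponential":
        return lambda: rng.expovariate(1.0 / mean)
    if kind == "normal":
        # truncated at zero so that times stay non-negative (an assumption)
        return lambda: max(0.0, rng.gauss(mean, sd))
    if kind == "constant":
        return lambda: mean  # a deterministic environment
    raise ValueError("unknown distribution type: " + kind)


# Each dictionary is one level of the "environment" factor.
FACTOR_LEVELS = [
    {"kind": "exponential", "mean": 2.0},
    {"kind": "normal", "mean": 2.0, "sd": 0.5},
]


def arrival_times(sampler, n):
    """Accumulate n interarrival times into arrival epochs."""
    t = 0.0
    times = []
    for _ in range(n):
        t += sampler()
        times.append(t)
    return times


if __name__ == "__main__":
    for level in FACTOR_LEVELS:
        times = arrival_times(make_interarrival_sampler(**level), 1000)
        print(level, "mean gap:", times[-1] / 1000)
```

The "constant" level shows that the same interface also covers fixed arrival times, so a change from a deterministic to a stochastic environment is a change of factor level and not of model.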


Responses 


The responses of an experiment are measures of performance 
which can be related to the real world system under study. 
Responses may be measures of costs or efficiency (such as re- 
source utilization); measures of service (such as response or 
delivery time); or measures of revenue (such as throughput
or production). Part of the difficulty in experimental
design stems from the existence of multiple and perhaps
conflicting responses. It is generally possible to improve service at
some increase in cost, but the model may well have nothing to
say about the value of the service improvement and hence the
user must weigh multiple responses with respect to his problem.
Any variable in the model can serve as a response variable.
Of importance to the experimenter, however, is the nature of
the observation that is to be made of the response variable. 
The response can reflect the model state at certain
predetermined fixed times during the simulation or under certain
prespecified conditions, as when a given queue size is greater
than or equal to n. It can be observed as a time series in such a
form that the result is tabulated or plotted as a function of 
simulated time. Graphical displays of response variables can 
be highly informative of model behavior. The response can also 
be some statistical evaluation of observations over time, e.g., 
the average inventory in the system during a time period, or 
the distribution of delivery time for products leaving the 
model system, or some measures of the distribution, such as
its standard deviation or its spectrum.
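
Two of these kinds of observation are easily sketched in Python: a time-weighted average of a variable that changes only at event times (such as average inventory), and a record of the state whenever a prespecified condition holds. The function names and the sample data are hypothetical.

```python
def time_average(times, values, end_time):
    """Time-weighted average of a piecewise-constant response.

    values[k] holds from times[k] until times[k + 1]; the last value
    holds until end_time.
    """
    total = 0.0
    for k, value in enumerate(values):
        t_next = times[k + 1] if k + 1 < len(times) else end_time
        total += value * (t_next - times[k])
    return total / (end_time - times[0])


def observe_when(times, values, condition):
    """Record (time, value) pairs at which the condition is satisfied."""
    return [(t, v) for t, v in zip(times, values) if condition(v)]


if __name__ == "__main__":
    event_times = [0.0, 2.0, 5.0]
    inventory = [10.0, 4.0, 0.0]
    print(time_average(event_times, inventory, end_time=10.0))
    print(observe_when(event_times, inventory, lambda level: level >= 4.0))
```

An ordinary average of the three inventory values would weight each equally; the time-weighted form accounts for how long each level persisted.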


Experimental Processes 


The design of an experiment encompasses virtually all el- 
ements of simulation: the establishment of the model, the de- 
termination of the environment in which the model is to be run, 
the determination of factors of the experiment, the selection 
of the responses of the experiment. The statement of the pro- 
blem, and the determination of the experimental purpose and 
its relationship to the problem, are matters that only the 
experimenter can deal with. However, once the purpose is 
clearly established, the body of knowledge dealing with the 
design of experiments and with the representation of systems 
can be brought into play. Experimental processes deal with 
the manner in which factor levels are established and changed. 
See Figure 3. 

To illustrate, consider the establishment of the model en- 
vironment for an experimental run. Fairly general processes 
are available to assist the user in establishing: 


1) The type of variable. The environment may be character-
ized either by stochastic or deterministic variables.
One of the attractive features of simulation is the es-
sential independence of the model from the type of vari-
able. For example, arrivals to the model from the en-
vironment can occur at fixed or random times, and the
change from one to the other can be made quickly and
even automatically between runs.


2) The type of analysis. A fundamental question is whether 
the model is to be analyzed in a steady state environ- 
ment or under transient conditions. Once again, drastic 
changes in the environment can be made with little or 
no need to change the model itself, by employing the 
appropriate process. 


3) The time dependency of the environment. A stochastic 
environment may be characterized as either stationary 
or non-stationary. It is often assumed that in a sta- 
tionary environment the model response will be station- 
ary. This assumption can have an important bearing upon 
the nature of the experiment and how it is conducted; 
however, it may be difficult to demonstrate that the as- 
sumption is, in fact, valid. 


4) The self-dependency of the environment. Dependency is 
generally a greater problem in analysis of responses 
than it is in establishing the environment. However, 
where dependency is desired in the environment (as when 
it is desired to input correlated variables), special 
processes may be required. 
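
The independence of the model from the type of environment variable (item 1 above) can be illustrated with a minimal Python sketch. It is not drawn from any of the languages discussed; the function names and the choice of a negative exponential distribution for the stochastic case are assumptions made for illustration. The "model" asks only for the next inter-arrival gap, so the environment can be exchanged between runs without touching the model.

```python
import random
from typing import Callable


def fixed_interarrival(interval: float) -> Callable[[], float]:
    """Deterministic environment: every gap between arrivals is identical."""
    return lambda: interval


def exponential_interarrival(mean: float, seed: int) -> Callable[[], float]:
    """Stochastic environment: gaps drawn from a negative exponential
    distribution (an assumed choice, for illustration only)."""
    rng = random.Random(seed)
    return lambda: rng.expovariate(1.0 / mean)


def arrival_times(next_gap: Callable[[], float], horizon: float) -> list[float]:
    """The model side: it only asks the environment for the next gap."""
    times, clock = [], 0.0
    while True:
        clock += next_gap()
        if clock > horizon:
            return times
        times.append(clock)


# Same model, two environments; only the supplied process changes.
deterministic = arrival_times(fixed_interarrival(2.0), horizon=10.0)
stochastic = arrival_times(exponential_interarrival(2.0, seed=1), horizon=10.0)
```

Correlated inputs (item 4) would require a gap generator that carries state from one call to the next, which is why special processes may be needed in that case.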


Experiments with stochastic, steady state, stationary, and 
independent environments tend to be popular, and hence sim- 
ulation languages often provide processes to establish such 
environments. Generalized processes for establishing other 
factor levels, such as initial states, have also been devel- 
oped. However, there has been relatively little progress in 
integrating such processes into complete experimental designs. 
The determination of run length is another area in which basic 
processes are available in simulation languages, but the com-
plex task of deciding how long to run is left entirely to the
user [3].


Facilities In Some Current Simulation Languages 


Current simulation languages provide a variety of facilities 
for establishing the factors of an experiment, modifying fac-
tor levels between experimental runs, specifying responses,
and establishing the length of simulation runs. The determin-
ation and implementation of experimental designs is left ex-
clusively to the user. In the following sections, the specific
facilities available in each of several languages are summar-
ized.


General Purpose Simulation System (GPSS) 


GPSS provides an organized framework into which the user 
maps his conception of the system being studied. This frame-
work is rich enough to support a number of built-in experi-
mental processes.


Factors 


Model structure - components. The components of a GPSS 
model are provided by GPSS and assembled by the user to repre-
sent the system of interest. The basic components are transac-
tions, whose individual movements or flow are controlled by
blocks and which interact with or use facilities and storages, 
i.e., equipment. Some components, e.g., storages, are declared 
explicitly and are therefore fixed throughout an experimental 
run; other components, such as facilities, are declared implic- 
itly. In general, the maximum number of each type of compon- 
ent is established at GPSS assembly time and is fixed for the 
duration of an experiment. 


Model structure - behavior. The behavior of a GPSS model is 
determined by a block diagram consisting of a sequence of blocks, 
each selected from a fixed set of block types, into which a 
flow of transactions can be introduced between arbitrary points 
in the block diagram. The block diagram is fixed for the dura- 
tion of an experimental run, although many of the parameters of 
each block can be dynamically modified in accordance with the 
instantaneous state of the model. The block diagram itself can 
be modified between runs of an experiment by substituting, add-
ing or eliminating blocks as desired. It is also possible,
through the use of a change block-type, to dynamically sub-
stitute for individual blocks during an experimental run.


Initializing the model state. External data is introduced
into a GPSS model in two ways: by function follower cards,
which specify the numerical values of a tabular function, and
by initial cards which specify the values to be placed into
save-value or matrix save-value locations. The initialization
of a model, however, often requires that transactions be as-
sociated with equipment and queues. This is normally accom-
plished by running the model for some initial period of time
(an initialization run). At the end of this run, a reset control
card is used to zero out all statistical attributes and the
model is then restarted for an experimental run. Where the
design of the experiment demands that several experimental runs
start from the same initial state, a save and read capability
is provided for saving the complete model state on tape. A
clear control card is also provided which, in addition to re-
setting statistics, eliminates all transactions from the model
as a means of "cleansing" the model for subsequent runs.


Environment - exogenous input. A GPSS model will accept a
stream of transactions which have been written on a job tape.
The transactions enter the model in accordance with the sim- 
ulated times which have been predetermined for each. A job 
tape may be prepared from any desired source, e.g., historical 
data, or it may be prepared by another GPSS model employing a 
write block. 


Environment - stochastic input. GPSS provides a random num-
ber generator capable of producing eight independent random
number sequences. Integer or real random numbers may be used 
in arbitrary fashions by the user. Sampling is accomplished 
automatically when a transaction enters a block, one of whose 
parameters is a random function reference. 

Transactions are introduced into a model through the use of
a generate block. The generate block provides for automatic
sampling of inter-arrival time with provisions for offsetting
the start of the generation process by an arbitrary simulated
time interval. The size and certain other characteristics of
the generated transactions are established at generate time
in accordance with parameters specified for each generate block.


Responses


Response variables. GPSS provides a set of standard re-
sponse variables in the form of statistical entities and at-
tributes. Data is maintained automatically for each queue in
the model, including the maximum content of the queue since the
start of the experimental run, the total number of entries to
the queue, the time-weighted average contents of the queue,
and the average time spent in the queue per transaction. The
second type of statistical entity is the table which provides
a means for constructing histograms and determining their stat-
istical measures. Calculations are made automatically based
upon tabulations accumulated throughout an experimental run.
The class intervals, range, and table argument are declared for
an experimental run, but may be redefined between runs. Stat-
istical attributes are also associated with facility and stor-
age entities, computing time-weighted equipment utilization
and occupancy data.
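
The four queue statistics listed above can be maintained with a single running time integral of the queue contents. The following Python sketch is an illustration only, not GPSS code; the class and method names are hypothetical, and the average time per transaction is computed here as the time integral divided by the number of entries, which is an assumption about the bookkeeping rather than a statement of how GPSS performs it.

```python
class QueueStats:
    """Running statistics for one queue, updated whenever its contents change."""

    def __init__(self, start_time: float = 0.0) -> None:
        self.start_time = start_time
        self.last_time = start_time
        self.content = 0        # current number of transactions in the queue
        self.max_content = 0    # maximum contents since the start of the run
        self.entries = 0        # total number of entries to the queue
        self.area = 0.0         # integral of contents over simulated time

    def _advance(self, now: float) -> None:
        # Accumulate contents * elapsed time before the contents change.
        self.area += self.content * (now - self.last_time)
        self.last_time = now

    def enter(self, now: float) -> None:
        self._advance(now)
        self.content += 1
        self.entries += 1
        self.max_content = max(self.max_content, self.content)

    def leave(self, now: float) -> None:
        self._advance(now)
        self.content -= 1

    def report(self, now: float) -> dict:
        self._advance(now)
        elapsed = now - self.start_time
        return {
            "max_content": self.max_content,
            "entries": self.entries,
            "average_content": self.area / elapsed if elapsed > 0 else 0.0,
            "average_time": self.area / self.entries if self.entries else 0.0,
        }


q = QueueStats()
q.enter(0.0)
q.enter(2.0)
q.leave(5.0)
q.leave(6.0)
summary = q.report(10.0)
```

In this small example the two transactions wait 5 and 4 time units, so the integral of contents is 9 over a run of length 10.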


Model state. The state of the model can be printed out at 
any point during a simulation run, in terms of the standard 
numerical attributes provided. The values of all statistical 
attributes are automatically printed out at the end of each 
experimental run. These statistical attributes can also be 
printed out at fixed intervals throughout the run, but cannot
be reset during an experimental run.


Time series. There is no explicit provision for generating 
or printing time series in GPSS. However, this can be accom-
plished in part by accumulating tables whose argument is the
simulated clock. An output editor is available with which to
control the format of the standard output. Tables and certain
other data can optionally be output in the form of printed bar
graphs.


Statistical evaluation. Much of the standard output men- 
tioned above consists of statistical measures accumulated over 
simulated time during an experimental run. 


Run-length. The length of an experimental run is determined 
by a start card which specifies the number of transactions to 
be terminated for the run. By controlling which transactions 
contribute to this termination count, it is easy to make the 
length of a run dependent either upon the simulated clock or
upon some arbitrary model state.
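
The termination-count mechanism can be sketched in a few lines of Python. This is an illustration under stated assumptions, not GPSS itself: the function name `run` and the event tuples are hypothetical, and each event stands for a transaction that terminates at the given simulated time and reduces the count by its own decrement.

```python
import heapq


def run(events, start_count):
    """Process (time, name, decrement) events in time order until the
    termination count reaches zero; return the final clock and the names
    of the transactions that terminated."""
    calendar = list(events)
    heapq.heapify(calendar)
    remaining = start_count
    clock, terminated = 0.0, []
    while calendar and remaining > 0:
        clock, name, decrement = heapq.heappop(calendar)
        terminated.append(name)
        remaining -= decrement
    return clock, terminated


# Run length tied to model state: every job counts toward termination.
jobs_counted = [(float(t), f"job{t}", 1) for t in range(1, 6)]
clock_a, done_a = run(jobs_counted, start_count=3)

# Run length tied to the clock: only a timer transaction counts.
jobs_uncounted = [(float(t), f"job{t}", 0) for t in range(1, 6)]
clock_b, done_b = run(jobs_uncounted + [(3.5, "timer", 1)], start_count=1)
```

In the first case the run ends when the third job terminates; in the second it ends at simulated time 3.5 regardless of how many jobs have passed through.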


SIMSCRIPT 


SIMSCRIPT is an event-oriented simulation language in which
the user has extensive control over the organization of his 
model. 


Factors 


Model structure - components. The components of the system 
to be modeled are represented as entities whose definition is 
established by the user. A permanent entity is one of a class 
whose total number is fixed for the duration of a simulation 
run. A temporary entity is a member of a class whose number 
can vary during the course of a simulation. The attributes 
defined by the user apply to all members of a given class of 
entities; however, the numerical values of the attributes of 
individual entities (factor levels) can be dynamically changed. 
Entities may be assembled into arbitrary groupings called sets;
however, the characteristics of each set, including the identi-
fication of the class of entities that may belong (member
entities) and the class of entities which may own the set
(owning entities) are also defined by the user for an entire
simulation run.


Model structure - behavior. The behavior of a system is 
described by a set of event routines, each of which is programmed 
by the user. During simulation, any number of events may be as- 
sociated with each event routine. The scheduling of each e- 
vent is performed with reference to the simulated clock, and 
may be accomplished dynamically for event routines defined as 
endogenous, but must be determined in advance for event rou-
tines defined as exogenous. The event routines available to
a model are established in a source program and fixed for the
duration of a simulation run.


Initializing the model state. The primary means for intro-
ducing data into a SIMSCRIPT model is through the array initial-
ization capability. The dimensions of each array, and hence
the number of each type of permanent entity, are specified along
with the input format and data values. A model may be rein-
itialized in this sense between runs without the need for re-
compilation. The form of the table for random variables (e.g.,
individual vs. cumulative probabilities) is also established at
initialization time. Further initialization of the model is
usually accomplished by event routines established for that pur-
pose. Such routines would normally create and locate the de-
sired temporary entities. Data for the initial attribute values
of temporary entities can be generated from random variables, or
read in from an input data tape. The model may, of course, be
run for any desired period of time to achieve a proper level of
initialization. The ability to record the initialized model
and subsequently restore its status for other experimental runs
is present in some versions of SIMSCRIPT. However, this capa-
bility is considered to be dependent upon the computer operat-
ing system and hence is not a basic part of the simulation lan-
guage.


Environment - exogenous input. Event routines which are des-
ignated as exogenous have associated with them an exogenous
event tape. The absolute simulation time for the occurrence
of each exogenous event is specified on the exogenous event
tape, along with data required by the associated event routine.


Environment - stochastic input. A random number generator
is provided as a source of random integers, equally distributed
over an arbitrary range, or floating point numbers uniformly
distributed between zero and one. Sampling from arbitrary prob-
ability distributions is accomplished by reference to random
attributes of permanent entities whose values are automatically
determined when referenced, by table look-up procedures. A
step function is provided for discrete probability distributions
and a linear interpolation procedure is provided for continuous
probability distributions. The tables associated with random
attributes are input at initialization time.
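
The two table look-up procedures can be sketched as follows. This Python fragment illustrates the general inverse-transform idea only and is not SIMSCRIPT code; the function names and the table layout (a list of cumulative probabilities paired with a list of values) are assumptions made for the example.

```python
import bisect
import random


def sample_step(cum_probs, values, u):
    """Discrete table: return the first value whose cumulative
    probability is at least u (a step-function look-up)."""
    return values[bisect.bisect_left(cum_probs, u)]


def sample_interpolated(cum_probs, values, u):
    """Continuous table: interpolate linearly between tabulated points.
    Assumes cum_probs is strictly increasing from 0.0 to 1.0."""
    k = bisect.bisect_right(cum_probs, u)
    k = min(max(k, 1), len(cum_probs) - 1)   # keep the bracket inside the table
    p0, p1 = cum_probs[k - 1], cum_probs[k]
    v0, v1 = values[k - 1], values[k]
    return v0 + (v1 - v0) * (u - p0) / (p1 - p0)


rng = random.Random(7)
# Discrete: values 1, 2, 3 with probabilities 0.2, 0.5, 0.3.
demand = sample_step([0.2, 0.7, 1.0], [1, 2, 3], rng.random())
# Continuous: piecewise-linear cumulative distribution through three points.
service = sample_interpolated([0.0, 0.5, 1.0], [0.0, 10.0, 30.0], rng.random())
```

Whether a table holds individual or cumulative probabilities is a matter of input format; a table of individual probabilities would first be accumulated into the cumulative form used here.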


Responses 

Response variables. Response variables are defined by the 
user as required, usually in the form of temporary or permanent 
attributes. Automatic conversions of the clock are provided to
facilitate references in terms of units such as days, hours,
and minutes.


Model state. An extensive report generator is provided
which permits the output in report form of any desired attri-
butes of any entities. Report routines, written by the user,
may be called from any event, and hence reports may be output
at any time.


Time series. There is no automatic capability for accumulat- 
ing time series data. The user defines his own procedures for 
collecting data on a periodic basis and then outputs the results
in a report of his own specification.


Statistical evaluation. Two statements are provided to facil- 
itate the collection of statistical data. The accumulate state- 
ment makes possible the integration of specified variables with 
respect to time. This is useful in collecting data such as the 
average number of entities in a given set. The compute state-
ment is used in calculating statistical measures with respect
to a collection of data, e.g., the mean and standard deviation.


Run length. The length of a SIMSCRIPT run is controlled by 
the use of a stop statement within an event routine and the 
scheduling of an event associated with that routine. 
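
The idea of ending a run with a scheduled stop event can be sketched in Python. This is an illustrative event loop, not SIMSCRIPT; the function name `simulate`, the single "arrival" event type, and the exponential inter-arrival times are all assumptions. Arrival events are scheduled dynamically, in the manner of endogenous events, while the stop event is placed on the calendar before the run begins.

```python
import heapq
import random


def simulate(stop_time: float, mean_gap: float, seed: int):
    """Run until the scheduled stop event occurs; return (clock, arrivals)."""
    rng = random.Random(seed)
    calendar = []   # entries are (event time, sequence number, event name)
    sequence = 0

    def schedule(time: float, name: str) -> None:
        nonlocal sequence
        heapq.heappush(calendar, (time, sequence, name))
        sequence += 1

    schedule(stop_time, "stop")                           # fixes the run length
    schedule(rng.expovariate(1.0 / mean_gap), "arrival")  # first arrival
    arrivals = 0
    while calendar:
        clock, _, name = heapq.heappop(calendar)
        if name == "stop":
            return clock, arrivals
        arrivals += 1
        # Each arrival schedules its successor dynamically.
        schedule(clock + rng.expovariate(1.0 / mean_gap), "arrival")
    raise RuntimeError("calendar emptied before the stop event")


final_clock, n_arrivals = simulate(stop_time=100.0, mean_gap=5.0, seed=42)
```

Making the run length depend on model state instead of the clock would amount to scheduling the stop event from inside another event routine when the desired condition is detected.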


SIMULA 


SIMULA is an ALGOL-based language such that ALGOL may be 
viewed as a proper subset of SIMULA. A simulation experiment 
or run is specified as an ALGOL procedure within which appears 
a SIMULA block. The model is defined within the SIMULA block. 
SIMULA adds to ALGOL the concept of simulation activities, of 
processes which execute an activity, and of sets of processes. 
(SIMULA 67 will provide additional extensions. This discus-
sion concerns the current revision of SIMULA only.)


Factors 


Model structure - components. The real world is represented
as a collection of processes. A given process is associated
with an activity block which constitutes the procedural de-
scription of its behavior. In addition to declaring simple
and array variables within such a block, the user may also
declare sets. A set is, in effect, a collection of processes.
Hence, in SIMULA, the components of the system have behavior. 
The user need not cause all of the components of his model to 
be active in this sense, but he is encouraged to recognize that 
any component may, in fact, have behavior associated with it. 
The scope of the SIMULA block, within which may appear any num- 
ber of activity blocks, determines an experimental run. 


Model structure - behavior. The description of system be- 
havior is accomplished in the SIMULA language statements which 
appear within activity blocks. ALGOL procedures may be called 
as desired. The interactions within and between processes are 
effected through the sequencing set, which is a list of
scheduled active phases (events) for all processes. Individual
processes may selectively be activated, made passive, or ter-
minated by means of scheduling and sequencing statements within
the SIMULA language.


Initializing the model state. SIMULA provides no special
procedures for initialization. Data is input to a SIMULA pro-
gram in the same manner as to an ALGOL program, primarily
through the use of input procedures, e.g., read. The procedures
for input and output of data are not standardized in ALGOL and
are, therefore, specified for each type of machine.


Environment - exogenous input. No explicit procedures are
provided by SIMULA to effect the execution of processes exog-
enously. The user would employ regular input procedures to ac-
complish this objective.


Environment - stochastic input. A mixed congruential random
number generator is provided and serves as the basis for all
sampling procedures. Procedures are provided for sampling from
the following continuous distributions: uniform, normal, nega-
tive exponential, and Erlang. Discrete distributions provided
are: binomial, equi-probable integer, and Poisson. In addition,
procedures are provided for sampling from tabular (cumulative)
distribution functions, either discretely or by linear inter-
polation; and for sampling from histograms generated during
simulation.


Responses


Response variables. All variables in a SIMULA program are
user defined, hence no standard response variables are provided.
Instead, several procedures are provided to assist in collect-
ing and analyzing simulation responses.


Model state. One procedure is provided to assist in report-
ing the current state of a model. This procedure computes and/or
prints, based upon a specified real array of one or two di-
mensions, any selection of the following items: mean, standard
deviation, maximum or minimum value, and range, of the elements
in each column of the array; or correlation coefficient be-
tween the columns of the array. For all other data, normal out-
put procedures are employed by the user.


Time series. Time series data would normally be collected 
and printed by user-defined procedures. The SIMULA histogram 
printing procedure (see below) could be employed for the plot-
ting of time series data. Specialized off-line plotting tech-
niques have also been used, but are not generally available.


Statistical evaluation. In addition to the array data anal- 
ysis mentioned above, an accumulate procedure is provided to 
assist in collecting the integral over time of selected vari- 
ables. A histogram procedure is provided for the purpose of 
updating paired arrays which the user wishes to employ for col-
lecting distribution data. A companion output procedure permits
the print-plotting of the content of the histogram arrays.


Run length. The length of a SIMULA run is determined by 
the SIMULA block. Any exit from this block constitutes the 
end of the run. Since the SIMULA block may be imbedded in an 
ALGOL program, it can be entered repetitively to carry out 
multiple runs in an experiment. Alternatively, activity blocks
within a SIMULA block can be used for reinitialization in
multiple-run experiments.


Program SIMULATE 


Program SIMULATE is a FORTRAN-based system designed for econ-
omic models employing block recursive equations. This orienta-
tion permits the inclusion of a variety of services for estab-
lishing, testing, and solving the equations of the model.


Factors 


Model structure - components. All variables used in the
model are declared in a variable list. No explicit distinction
is made between variables which represent structural components
of the real world to be used as factors in an experiment, and
response variables. The experimental interpretation is a user
responsibility.


Model structure - behavior. The behavior of the model is 
established by the set of block recursive equations written by 
the user. These equations are written as sums of functions of 
two variables, where the functions are selected from a fixed 
set of prescribed functions provided by SIMULATE. Any variable 
may be lagged by from zero to nine time periods. The blocking 
of simultaneous equations and their solution is accomplished 
automatically by the system. Facilities are provided for test- 
ing the validity of individual equations by computing residual 
errors based upon historical data. For experimental purposes, 
changes in factor levels are made between runs by adding, delet-
ing or changing equations and variables. The solution of the
entire system proceeds on a fixed time period calculation basis.
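
A fixed-period solution of lagged equations can be illustrated with a deliberately small Python sketch. It is not Program SIMULATE; the two-equation income model, the parameter values, and the function name are hypothetical, and the system is purely recursive, so no simultaneous block needs to be solved. Consumption depends on income lagged one period, C(t) = a + b Y(t-1), and income is consumption plus exogenous investment, Y(t) = C(t) + I(t).

```python
def simulate_income(periods, investment, initial_income, a=10.0, b=0.5):
    """Solve the recursive system one time period at a time.

    investment      exogenous series, one value per simulated period
    initial_income  income for the period before the run (maximum lag is one)
    """
    income_history = [initial_income]   # lagged values needed by the equations
    consumption = []
    for t in range(periods):
        c = a + b * income_history[-1]  # uses income lagged one period
        y = c + investment[t]           # uses the current exogenous input
        consumption.append(c)
        income_history.append(y)
    return consumption, income_history[1:]


# With a=10, b=0.5 and investment of 20, income of 60 is a steady state.
consumption, income = simulate_income(3, [20.0, 20.0, 20.0], initial_income=60.0)

# A change in factor level between runs: raise investment in every period.
_, income_high = simulate_income(3, [30.0, 30.0, 30.0], initial_income=60.0)
```

The need to supply `initial_income` mirrors the requirement that data be provided for all periods back to the maximum lag of each variable.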


Initializing the model state. Data is supplied in standard
format by the user to set the value of all variables in the
initial period and values for all previous periods back to the
maximum lag for each variable.


Environment - exogenous input. All exogenous variables are
declared in the variable list. Data must be provided for such
variables for each period of the simulation run. In addition,
historical data for endogenous variables over the length of the
simulation run can also be provided, either for the purpose of
calculating residual equation errors or for comparing simulation
output to real data.


Environment - stochastic input. Where desired, exogenous
variables can be perturbed by the imposition of random distur-
bances. These disturbances are assumed to be normally distributed
with mean zero and standard deviation specified for the vari-
able. The user may specify either separate random number se-
quences for each random variable, or the use of a common random
number sequence for all random exogenous variables. Random dis-
turbances can be suppressed for all exogenous variables for any
simulation run.


Responses 


Response variables. A special run may be made for the pur- 
pose of testing the fit of individual equations against histor- 
ical data. Residual errors are automatically computed. If de- 
sired, historical data for endogenous variables may be compared 
against the simulation results, and forecast errors (the devia-
tion of simulation from historical data) are automatically
computed.


Model state. A variety of output options are provided to
assist the user in tracing the solution of the model. For ob-
serving the model state, the value of each variable is output
at the end of each time period.


Time series. Although the above output is effectively time 
series data, no provision is made for selective organization of 
the data in either tabular or plotted form. 


Statistical evaluation. The error computations indicated 
above may be obtained in a form which is normalized with re-
spect to the standard deviation of the variable. No other form
of statistical analysis of results is provided.
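
A normalized error of this kind might be computed as in the following Python sketch. The text does not say which standard deviation is used, so the choice here (the population standard deviation of the historical series) is an assumption, as is the function name.

```python
import statistics


def normalized_errors(simulated, historical):
    """Forecast error (simulated minus historical) in each period, divided
    by the standard deviation of the historical series (assumed choice)."""
    sd = statistics.pstdev(historical)
    return [(s - h) / sd for s, h in zip(simulated, historical)]


historical = [1.0, 2.0, 3.0, 4.0, 5.0]
simulated = [2.0, 2.0, 3.0, 4.0, 5.0]
errors = normalized_errors(simulated, historical)
```

Expressing errors in units of standard deviation allows variables measured on very different scales to be compared on a common footing.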


Run length. The run length is expressed as a forecast hori-
zon, in terms of the number of periods to be simulated. This
is fixed for each run in the run-control card.


Future Requirements 


The main thrust of simulation language development has been 
to assist the user in developing the structure of his model and 
manipulating his experimental factors. Support is provided in 
varying degrees for identifying, analyzing and displaying re-
sponse variables. The establishment of experimental designs
and the determination of relationships between the design, the en-
vironment, the model structure, and the responses are entirely
the user's responsibility.

The principal need at this time is to extend the capabil- 
ities of simulation languages to more readily facilitate and 
support the conduct of simulation experiments. More attention 
must be devoted to considerations of how the model is to be 
used, in addition to the established concern for how it is to
be built. Progress is required in each of the following areas:


Experimental Design for Simulation 


The vast body of knowledge concerning experimental design 
should be applied more explicitly to the simulation context. 
The relationship between the environment of the model and the 
design of the experiment should be explicitly defined. Proce- 
dures for controlling run length should be an intrinsic part 
of any design. Designs should be packaged so that the simula- 
tion user, who more often than not has a limited statistical 
background, can select well-structured procedures with confi-
dence.


Modular Simulation Systems 


Simulation systems should provide standard interfaces be- 
tween experiments, structural components of models, model en- 
vironments, and output analyses. As a first step, it should 
be convenient for the user to interchange such modules. For
example, under the control of an experiment, a model might be 
run in several different environments; or several different 
experimental designs might be applied to a given model. As a 
second step, one would like to have available libraries of 
generalized experimental, environmental, and analysis processes 
from which to select for a model. It would be appealing to 
select an "Analysis of Variance" process, identify to it the 
structural components of the model which constitute the exper-
imental factors, identify the response variables, and "press
the button" for the experiment to begin.
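
What such an interchangeable "Analysis of Variance" process might look like can be suggested by a short Python sketch. Everything in it is hypothetical: the function names, the toy queueing response, and the restriction to a single factor are assumptions made to keep the example small. The experiment module knows nothing about the model except that it accepts a factor level and a random number source and returns a response.

```python
import random
import statistics


def one_way_anova_f(groups):
    """F statistic for a one-way analysis of variance."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - statistics.fmean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def run_experiment(model, factor_levels, replications, seed=0):
    """Run the model at each factor level and analyze the responses."""
    rng = random.Random(seed)
    responses = [[model(level, rng) for _ in range(replications)]
                 for level in factor_levels]
    return one_way_anova_f(responses)


def toy_model(servers, rng):
    """Hypothetical response: mean wait falls as servers are added."""
    return 10.0 / servers + rng.gauss(0.0, 0.5)


f_statistic = run_experiment(toy_model, factor_levels=[1, 2, 4], replications=5)
```

Because `run_experiment` receives the model as an argument, a different model, or a different analysis in place of `one_way_anova_f`, could be substituted without altering the other modules.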




On-Line Access to Experimental Facilities 


The use of terminals to access these facilities obviously 
enhances their utility. Extensive automation of experimental 
designs will be difficult to achieve. The demand upon the de-
signer of the simulation system, and the resulting constraints
upon the user, can be reduced by providing for user participa- 
tion in the process. On-line facilities could be used to aid 
the user in selecting an appropriate experimental design, to 
assist him in establishing the experimental factors, to support 
the structuring of the model, to help control costs by monitor- 
ing large and lengthy experiments, and to expedite the display 
and analysis of response data. 


Data Acquisition 


Procedures for initializing models have been fairly well 
established. However, the collection of the data with which 
to initialize the model, or supply it with exogenous inputs, 
is currently outside the scope of the simulation system. Where 
external data requirements are large, as they often are even 
when ultimately reduced to statistical parameters for input to 
the model, some level of system support becomes essential. As 
large organizations increasingly construct extensive and com- 
prehensive data bases, it will become imperative to provide 
ready access to this data for purposes of simulation experimenta- 
tion. The problem here is one of interfacing with data which 
is collected and maintained primarily for other purposes. Suit-
able access could greatly improve the simulation process in
many instances. 

Impressive progress has been made in the past to facilitate 
simulation modeling. Much work remains in order to provide a 
comparable level of support for simulation experimentation.
This symposium has recognized the need and will hopefully pro-
vide an important stimulus for future development.


Figure 1. Context of simulation experiments
(Diagram not reproduced. Labels: problem to be studied; design an
experiment; describe a system; behavior; components; start experiment;
input initial model state; observe model performance.)


Figure 2. Purposes of simulation experiments
(Diagram not reproduced. Labels: evaluate; validate; compare; verify;
optimize; establish; predict; analyze sensitivity; structural factors;
resources; plans; operating rules; hypotheses; functional relations.)


Figure 3.
(Diagram and caption not legible in the source. Recoverable labels:
model environment; exogenous input; stochastic input; model structure;
components; behavior; initial state; instantaneous state; time series;
statistical evaluation.)




Bibliography 


1. Control and Simulation Language User's Manual. IBM Data
Centre, London.


2. Dahl, O.J., and Nygaard, K. SIMULA, Introduction and User's
Manual. Oslo: Norwegian Computing Center.


3. Fishman, G.S. "Digital Computer Simulation: The Alloca-
tion of Computer Time in Comparing Simulation Experi-
ments," The RAND Corporation, RM-5288-1-PR, October, 1967.


4. Fishman, G.S., and Kiviat, P.J. "The Analysis of Simula-
tion Generated Time Series," Management Science, (March,
1967).


5. Fishman, G.S., and Kiviat, P.J. "The Statistics of Dis-
crete-Event Simulation," Simulation, (April, 1968).


6. General Purpose Simulation System/360 User's Manual, IBM
H20-0326-0, 1967.


7. Holt, C.C., et al. Program SIMULATE II, A User's and
Programmer's Manual, Madison: University of Wisconsin,
Social Systems Research Institute. Copyright C.C. Holt,
1965.


8. Luther, E., and Markowitz, H. "SimOptimization Research,
Phase II," Consolidated Analysis Centers, Inc., Report
66-P2.0-1.


9. Markowitz, H.M., Hausner, B., and Karr, H.W. SIMSCRIPT:
A Simulation Programming Language. Englewood Cliffs,
N.J.: Prentice-Hall, Inc., 1963.


10. Mechanic, H., and McKay, W. "Confidence Intervals for
Averages of Dependent Data in Simulations," IBM Advanced
Systems Development Division, TR17-202, August, 1966.


11. Naylor, Thomas H., Wertz, K., and Wonnacott, Thomas H.
"Methods for Analyzing Data from Computer Simulation
Experiments," Communications of the ACM, (November, 1967),
703-710.


12. Tocher, K.D. The Art of Simulation. Princeton, N.J.: D.
Van Nostrand Co., Inc., 1963.


Distribution of Blocks 
of Signs 


Norman R. Draper and Willard E. Lawrence, 
University of Wisconsin 


The problem considered in this paper is that of obtaining, 
on a computer, the exact distribution of blocks of signs of
the same type for certain small grids of plus and minus signs. 
If these are known, the results can be used to examine and 
test the residuals patterns which arise, after regression 
analysis, from observations in similar grids. (Grids of this 
type occur frequently in, for example, geological and meteoro- 
logical work.) Details of application are given in the refer- 
ence at the end of the paper. 

Even for small grids the calculation of the exact distribu- 
tions is very time consuming. It would be desirable to find 
faster methods of doing these computations. Alternatively, 
approximate distributions might be obtained by sampling the 
exact distributions. A third possibility is to use simulation 
techniques to provide possible allocations of signs in the 
grids, and to approximate the distributions accordingly. All 
these possibilities need further work. The purpose of this
paper is to exhibit the problem, to provide a basis on which 
further work can be built and to give results against which 
future calculations can be checked. 

The motivation for considering the present problem has been 
discussed by Draper and Lawrence [1] and applications and ref- 
erences to allied work will be found in that paper. Our object 
here is to emphasize the computer aspects of the problem and 
to give the distributions which we have obtained. 

Suppose we have a grid with p rows and q columns filled with
n signs of one type (plus or minus) and (pq-n) signs of the
opposite type (minus or plus) where n < pq/2. We shall define
two signs of the same type to be in the same block if and only 
if exactly one of the following is true: 


1. The two signs are adjacent either horizontally or 
vertically (not only diagonally) 


2. The two signs are connected by a chain of adjacent 
horizontal and/or vertical links (not only diagonal 
links). 


Figure 1 gives some examples of the numbers of blocks which
would occur in certain arrays under this convention. In two
of these examples, the blocks have been indicated by dividing
lines.


Figure 1. Examples of various numbers
of blocks in a 3 x 3 grid
(Example arrays not reproduced; the four arrays shown contain
2, 5, 3, and 4 blocks.)


(Readers unhappy with this block convention should note 
that, even if blocks are differently defined, the appropriate 
distribution calculations can be carried out in similar fash- 
ion.) 

For our application, we need the discrete distribution of 
the number of blocks which can occur, given the values of p, 
q, and n. These distributions appear rather more difficult to
obtain than one might expect, and the computational time needed
escalates rapidly as p, q, and n increase. For this reason the
work has been limited to obtaining the distributions for the
combinations (p,q) = (3,3), (3,4), (3,5), (3,6), (4,4), (4,5)
and for all n ≤ pq/2 in each combination. The exact frequencies
and the appropriate means and variances for these
combinations are given in Table 1, and the corresponding cum- 
ulative distributions are given in Table 2. 

We now briefly outline the method used in the computer program
to generate the distributions given in Table 1, using a
3 x 3 grid for illustration. Consider the situation where
there are four plus signs. We can imagine the rows of the 
grid laid out one after the other to form a single row of signs; 
for example the grid 

        + + -
        + - -
        - - +
can be written as ++-+----+. We can obtain all possible 3 x 3 
grids with four plus signs by finding systematically every pos- 
sible way of allocating the four pluses in the single nine 
element row (9C4 = 126 ways) and then reforming the grid each 
time. When p < 10, q < 10, we can number the grid square in 
the ith row and jth column as 10i + j. For the example above 
we then have pluses in squares 11, 12, 21, and 33 and minuses 
in squares 13, 22, 23, 31, and 32. We can now count the blocks
formed by the pluses and by the minuses separately. Consider
the pluses in squares 11, 12, 21, and 33. We can begin with
11 and ask if any other "plus" squares are adjacent, i.e.,
differ by ± 1 or ± 10. In this way we get a grouping (11, 12,
21). We now examine 12 and 21 and see if any other plus
squares are adjacent. None is, so that the block is complete. 
(If other squares had been added, they too would have been used 
to search for adjacent plus squares and so on.) We now remove 
the block (11,12,21) which leaves 33 as the only plus sign. 
There are, thus, only two blocks of plus signs in this arrange- 
ment of signs. Similarly we count the minus blocks, and so 
find the total number of blocks. The frequency count is up- 
dated and the next allocation of signs in the sequence is ex- 
amined. 
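
The enumeration just described can be sketched in a few lines of Python. This is only an illustration of the procedure, not the authors' program: the function names are hypothetical, and the search is written as a simple flood fill over the 10i + j square labels.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def count_blocks(squares):
    """Count blocks among same-sign squares labelled 10*i + j."""
    remaining = set(squares)
    blocks = 0
    while remaining:
        # Start a new block from any unused square and grow it outward.
        frontier = [remaining.pop()]
        while frontier:
            sq = frontier.pop()
            # Horizontal neighbours differ by 1, vertical ones by 10.
            for nb in (sq - 1, sq + 1, sq - 10, sq + 10):
                if nb in remaining:
                    remaining.remove(nb)
                    frontier.append(nb)
        blocks += 1
    return blocks


def block_distribution(p, q, n):
    """Exact frequencies of the total block count for n pluses in a p x q grid."""
    assert p < 10 and q < 10, "the 10*i + j labelling needs p, q < 10"
    cells = [10 * i + j for i in range(1, p + 1) for j in range(1, q + 1)]
    freq = Counter()
    for pluses in combinations(cells, n):
        minuses = set(cells) - set(pluses)
        freq[count_blocks(pluses) + count_blocks(minuses)] += 1
    return freq


def mean_blocks(freq):
    """Exact mean number of blocks as a fraction."""
    total = sum(freq.values())
    return Fraction(sum(b * f for b, f in freq.items()), total)


# The worked example from the text: pluses in 11, 12, 21, 33.
example = count_blocks([11, 12, 21, 33]) + count_blocks([13, 22, 23, 31, 32])
dist = block_distribution(3, 3, 4)
print(example, sum(dist.values()), sorted(dist.items()))
```

Because every allocation is visited, the running time grows with the binomial coefficient pqCn, which is consistent with the rapid escalation in computing time noted above.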

This work was partially supported by the Wisconsin Alumni 

Research Foundation; N.S.F. (Grant GP-7842); Imperial College, 
London; and the Marquette University Committee on Research.


Bibliography 


1. Draper, N.R., and Lawrence, W.E. "Residuals in Two Dimensions,"
   Technometrics, XII (1970).


Table 1. Exact frequencies for the number of blocks of like
signs in a p x q grid with n ≤ pq/2 signs of one type

[Table body not legible in the scanned page. For each case (3x3,
3x4, 3x5, 3x6, 4x4, 4x5) and each n, the table gives the mean,
the variance, the total number of arrangements, and the frequency
of each number of blocks from 2 to 20.]


Table 2. Cumulative distributions for the number of blocks of like
signs in a p x q grid with n ≤ pq/2 signs of one type

[Table body not legible in the scanned page. The table gives, for
each case (3x3, 3x4, 3x5, 3x6, 4x4, 4x5) and each n, the cumulative
distribution of the number of blocks.]





Applications 







The Evaluation of 
Economic Policies 


Gary Fromm, The Brookings Institution 


"The Congress hereby declares that it is the continuing 
policy and responsibility of the Federal Government to use all 
practicable means consistent with its needs and obligations and 
other essential considerations of national policy... to pro- 
mote maximum employment, production, and purchasing power," 
(15 U.S.C. 1021). With the passage of this legislation, 
known as the "Employment Act of 1946," and notwithstanding 
nascent efforts in the application of Keynesian concepts in 
the 1930's, a new era began in the formulation, application 
and review of government policies that affect employment and 
incomes. As Paul Douglas observed on the twentieth annivers- 
ary of the Act, it probably is not accidental that the last 
two decades have not witnessed a depression and have seen
enormous gains in economic prosperity and human happiness [1]. 

However, it was little appreciated twenty years ago that 
some of the stated objectives of the Act were inconsistent; 
for example, tradeoffs between maximum employment and purchasing
power were ignored. Nor were other important goals, such
as economic growth, price stability, equilibrium in the bal- 
ance of payments, the efficient allocation of resources, a 
balanced distribution of income, and other related social ob- 
jectives specifically mentioned. These have all, at one time 
or another, been considered by the President, the Congress, the 
Joint Economic Committee and the Council of Economic Advisers 
in formulating and implementing policies under the Act. 

However, the Act has contributed significantly to an increasing
literacy on economic matters on the part of policy makers in
government, and also of the press and the public. Many useful
new concepts, including the full employment surplus, potential 
output, and fiscal drag, were developed and applied in deter- 
mining the need for changes in economic policies. Yet, even 
today, more than twenty years after the Council of Economic 
Advisers and the Joint Economic Committee began their efforts, 
the methods used to evaluate the impact of policy alterations 
can be greatly strengthened.

Policies are adopted with insufficient analysis of their 
consequences for the structure and functioning of the economy
and with ad hoc estimates of their quantitative effects. A 
good case in point is the wage-price guideposts. As originally 
formulated in the 1962 Economic Report, they neither would 
lead to the duplication of competitive conditions nor would 
they be neutral with respect to the distribution of real out- 
put, resource inputs, or factor income shares -- all stated 
objectives. The guideposts implicitly assumed that the elas- 
ticity of substitution of labor for capital was equal for all 
industries and, moreover, equal to unity. If it is less than 
unity, for which there is some, although conflicting, evidence,
then the capital share of industry income or product is favored 
by the guideposts. If the elasticity of substitution differs 
between industries, then the resource mix and cost implica- 
tions also differ. 

Perhaps another example of insufficient justification of a 
policy proposal is the recent income tax surcharge. The 1968 
report of the Council of Economic Advisers recommends a ten- 
percent surcharge, which was subsequently enacted, to be im- 
posed on income taxes of corporations from January 1, 1968 
through mid-1969; and on income taxes of individuals from 
April 1, 1968 through mid-1969. Nowhere in the Report does 
one find the rationale for the differing dates of imposition. 
Nor does the Report contain estimates of the quantitative im- 
pact of the surcharge on the total level of final demand or on 
prices. The discussion of impacts and the tradeoffs between a
tax increase and tightening of credit conditions is cast solely
in qualitative terms. Moreover, while most observers probably
would agree on the need for a tax increase, its amount might be
a matter of contention. The Council does not justify its selection
of the ten percent figure in contrast to any other.

In this regard, two points are at issue. First, there is 
the question of determining the impact of alternative policies. 
Second, there is the question of evaluating the desirability of 
the alternatives. On the first point the Council has expressed 
the view that: 

    Indeed, forecasting of some kind is indispensable. [However]
    the limitations of the economists’ ability to predict
    the future argues for prudence in policy decisions, flexibility
    in the use of instruments, and continuing efforts
    to improve the reliability of forecasting techniques, (1968
    Report, pp. 88-9).

Yet, the only measures it discusses for increasing predictive 
ability are efforts to improve the quantity and quality of 
economic data. Certainly, this would be beneficial. But, much 
more could, and should, be done. 

There is a disparity between the state of the art in the use 
of formal models to analyze the economy and to ascertain the 
effect of policy changes, and the techniques employed by the 
Council. The formulation and construction of econometric models 
have progressed greatly from the early days of simplistic and 
incomplete sets of a few linear equations. Today's models are 
far more powerful than the predecessors familiar to economists, 
the Klein-Goldberger [4], and Wharton models [5], of one or two 
decades ago. For example, the largest system at present, the 
Brookings model [2], has several hundred equations and includes 
an extensive description of financial, housing, and foreign sec- 
tor relationships. It also integrates into the fabric of the 
model an input-output structure that translates GNP demands 
into industry outputs and combines industry prices into GNP com- 
ponent prices. This model has been used in simulation experi- 
ments of fiscal and monetary policy shifts [3]. 

I do not mean to imply any criticism of past or present
members of the Council. With its traditionally pressure-cooker
atmosphere, small staff, and limited budget, it is impossible 
for the Council to do the fundamental research necessary to 
build and validate models that meet its forecasting and policy- 
making needs. But, the time has come to break with tradition 
and give the Council the resources necessary to perform this 
research. The goal is to give its forecasts and recommenda- 
tions a sounder theoretical and empirical base. 

This is not to say that the Council should subjugate its de- 
cisions to a set of complete-system econometric models that de- 
scribe the structure of the economy for short-, intermediate-, 
and long-term analyses. The models are not yet sufficiently 
accurate to rely on their solutions to the exclusion of a priori 
reasoning and informed and experienced judgment. Nevertheless, 
the models provide useful and valid information (with relatively 
narrow confidence limits) and can assist in making the determination
of policy impacts more objective. This is especially true
when the alternatives have nonlinear effects and are combina- 
tions of changes (in contrast to shifts in a single parameter or 
variable) which must be studied in a dynamic setting. 

Models also can be helpful in evaluating the desirability of 
the alternatives. The first step in that process is to obtain 
a detailed listing of impacts. For example, the effect of a 
policy on unemployment, wage rates, prices, profits, industry 
outputs, investment, government expenditures, housing starts, 
interest rates, and so forth, should be determined. The next 
step is to identify the groups of individuals, for instance, 
white or non-white, young or old, male or female, urban or rural, 
affected by these changes and to ascertain the extent to which 
their "welfare" has been altered. Finally, it is necessary to
weight these welfare or utility shifts to obtain an aggregate
measure for inter-policy comparisons.

Presumably, under the system of government in the United 
States, the final selection of the policy to be pursued rests 
with the President and the Congress. However, when the Council 
only presents a single recommendation without alternatives,
then the Congress is at a great disadvantage in deciding upon
the merits of the suggested course of action.

The reluctance of the Council to present alternatives, in 
nature or degree, is understandable. The presentation of al- 
ternatives could lead to extended discussion on the net advant- 
ages of each, with the result that no action is taken. In 
other words, the strategy of attempting to obtain legislation 
is to pose the alternatives as a favorable state of affairs, 
or dire consequences. 

Whatever the outcome of this choice, the Council has implic- 
itly imposed its utility function on the tradeoffs between 
goals and impacts, or its perception of the nation's or the 
Congress' utility function, on the decision process. That is, 
the recommended action already contains judgments as to the 
relative utility of the various impacts which are predicted to 
occur. 

It probably is not appropriate, or desirable, to allow such 
great discretion to the Council without explicit consideration 
of the welfare implications of the choice. That is, notwith- 
standing the loss of tactical gains in getting legislation 
passed, it is incumbent on the Council to more fully set out the 
choices, present a detailed list of impacts, and show how vari- 
ous groups and interests will be affected by each alternative. 

Moreover, explicit consideration of utility tradeoffs
might also be helpful in the formulation and evaluation of pol- 
icies. In a path-breaking and celebrated article forty years 
ago, Frank Ramsey grappled with these problems [7]. The partic- 
ular issue that he tackled was, "How much of its income should 
a nation save?" While they did not enter Ramsey's analysis, in 
a more modern setting, income tax rates would certainly be a 
consideration in obtaining the optimum savings conditions. 

The Ramsey technique for deriving these conditions was to 
posit a generalized utility function and maximize utility sub- 
ject to budget and income generation constraints. 

To illustrate the method let: 

    t = time
    C = consumption
    Y = income
    L = labor
    K = capital stock
    $\dot{K}$ = dK/dt = rate of growth in capital stock
    $\Gamma$ = utility function of consumption
    $\Psi$ = disutility function of labor
    $\Phi$ = total utility

Then

(1)  $\Phi = \int_{0}^{\infty} [\Gamma(C) - \Psi(L)]\,dt$.

The objective is to maximize $\Phi$ subject to the budget constraint:

(2)  $C + \dot{K} = Y$

and the income generation constraint,

(3)  $Y = f(K, L)$.

This is a problem in the calculus of variations with the functional
given by $F = F(L, K, \dot{K})$. It can be solved for optimum
marginal conditions for returns to labor and capital and the 
time paths of investment and capital stock. 
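
As a sketch of where those marginal conditions come from, the two constraints can be substituted into the integrand and the Euler equations written out. The notation below is assumed for illustration: $\Gamma$ is the utility of consumption, $\Psi$ the disutility of labor, $f$ the income generation function, and subscripts on $f$ denote partial derivatives.

```latex
% Integrand after substituting C = f(K, L) - \dot{K}:
\begin{align*}
F(L, K, \dot K) &= \Gamma\bigl(f(K,L) - \dot K\bigr) - \Psi(L), \\
\frac{\partial F}{\partial L} = 0
  &\;\Longrightarrow\; \Gamma'(C)\, f_L = \Psi'(L), \\
\frac{\partial F}{\partial K}
  - \frac{d}{dt}\frac{\partial F}{\partial \dot K} = 0
  &\;\Longrightarrow\; \frac{d}{dt}\,\Gamma'(C) = -\, f_K\, \Gamma'(C).
\end{align*}
```

The first condition equates the marginal utility of the product of labor with its marginal disutility; the second ties the rate of decline of the marginal utility of consumption to the marginal product of capital.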

There are several difficulties with applying this method 
directly in the analysis of policies for cyclical stabiliza- 
tion. First, while the utility function is general, it does 
not include many of the variables, such as income distribution, 
unemployment, prices, government expenditures and deficits, 
that may be of concern to society. Second, no provision is 
made for accounting for the complementarity of utilities (in 
equation (1) they are additive). Third, account is not taken 
of the disutility of the time-path variances of the arguments 
in the utility function (that is, of the cyclical paths of C 
and L). Finally, the income generation constraint is general. 
When realistic income generation constraints of a dynamic non- 
linear nature are substituted for the general constraint, the 
derivation of decision rules and optimum conditions becomes
almost hopelessly complicated.





Henri Theil [8] also has tackled the problem of devising op- 
timal strategies for economic policies. His approach is to as- 
sume that the utility function is quadratic and the constraints 
are linear. For example, let:

    t = time, t = 1, ..., T

    $x = [x_h(t)]$, h = 1, ..., m, be a vector of instrument
        (exogenous) variables such as government expenditures

    $y = [y_i(t)]$, i = 1, ..., n, be a vector of noncontrolled
        (endogenous) variables such as consumption or investment

    R = a matrix, of order nT x mT, of fixed elements describing
        the multiplicative constraints

    $s = [s_i(t)]$, i = 1, ..., n, a vector of fixed elements
        describing additive constraints

    $\omega$ = utility

    a, b, A, B, C = vectors and matrices of fixed elements and
        of appropriate order, A and B being symmetric.

Then, the preference (utility) function,

(4)  $\omega(x, y) = a'x + b'y + \tfrac{1}{2}(x'Ax + y'By + x'Cy + y'C'x)$

is maximized subject to the linear constraints:

(5)  $y = Rx + s$.


While this approach does not suffer from the complementarity 
(covariation) and time variation difficulties of the Ramsey 
method, it has two serious drawbacks. First, while the assump- 
tion that the preference function is quadratic is attractive 
because it greatly simplifies the mathematics (the derivatives 
are linear combinations of the variables) it is a strong one 
and cannot be justified on a priori grounds. Second, frequently 
it is not possible to transform the estimated equations describ- 
ing the structure of the economy into a reduced form of linear 
constraints. 
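
When the quadratic and linear assumptions do hold, the optimum follows in closed form by substituting the constraints into the preference function. The sketch below takes the preference function as $a'x + b'y + \tfrac{1}{2}(x'Ax + y'By + x'Cy + y'C'x)$ and the constraints as $y = Rx + s$; the symbols $k_0$, $k$, and $K$ are shorthand introduced here for illustration only.

```latex
\begin{align*}
\omega(x) &= k_0 + k'x + \tfrac{1}{2}\, x'Kx, \\
k &= a + R'b + (C + R'B)\,s, \\
K &= A + R'BR + CR + R'C', \\
\frac{\partial \omega}{\partial x} = k + Kx = 0
  &\;\Longrightarrow\; x^{*} = -K^{-1}k
  \quad\text{(a maximum if $K$ is negative definite).}
\end{align*}
```

The linearity of the first-order condition in $x$ is exactly the simplification that the quadratic assumption buys.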

Because of the difficulties of the Ramsey and Theil techniques,
another approach that suggests itself is to employ trial-and-error
methods to obtain a mapping of the utility possibility
frontier. That is, given a model of the economy and the specification
of the social welfare (utility) function, it is pos-
sible to derive, via repeated solution of the model (i.e., sim- 
ulation), the utility of alternative policy actions. 

Fromm and Taubman [3] have recently applied this technique 
for evaluation of the relative desirability of a set of mone- 
tary and fiscal policy actions. Three different utility func- 
tions were used: 


(6) Linear        $u = \sum_i \beta_i x_i$
(7) Cobb-Douglas  $u = \prod_i x_i^{\beta_i}$
(8) CES           $u = \bigl[\sum_i \beta_i x_i^{\delta}\bigr]^{1/\delta}$

where u is utility which depends on the variables (termed arguments)
$x_i$, i = 1, ..., n. These arguments would be non-controllable
and instrument variables such as consumption, investment, un-
employment, prices, and government expenditures. The $\beta$ and $\delta$
coefficients are fixed parameters.
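
The three functional forms can be written compactly in code. The following Python sketch is illustrative only: the function names, the sample arguments, and the weights are assumptions, not values from the Fromm-Taubman study.

```python
import math


def linear_utility(x, beta):
    """Linear form: weighted sum of the arguments."""
    return sum(b * xi for b, xi in zip(beta, x))


def cobb_douglas_utility(x, beta):
    """Cobb-Douglas form: product of arguments raised to their weights."""
    return math.prod(xi ** b for b, xi in zip(beta, x))


def ces_utility(x, beta, delta):
    """CES form with substitution parameter delta (delta != 0)."""
    return sum(b * xi ** delta for b, xi in zip(beta, x)) ** (1.0 / delta)


# Hypothetical arguments (e.g. ratios of policy to control values)
# and hypothetical weights that sum to one.
x = [1.02, 1.05, 0.98]
beta = [0.5, 0.3, 0.2]

print(linear_utility(x, beta))
print(cobb_douglas_utility(x, beta))
print(ces_utility(x, beta, delta=0.5))
```

Two limiting cases tie the forms together: with delta equal to one the CES form reduces to the linear form, and as delta approaches zero (with weights summing to one) it approaches the Cobb-Douglas form.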

A further adjustment to these standard equations is required 
because of the dynamic response of the economy to fiscal and 
monetary policy changes. These lag responses are different for 
each policy and vary over (simulation) time. Therefore, one 
cannot select the "best" policy for period one and then the 
"best" policy for period two because the solution for the later 
period depends on the earlier impacts. Thus, it becomes nec- 
essary to determine the utility value of the entire path for 
each policy. Assuming that certain homogeneity, stationarity, 
and independence conditions hold [3], then aggregate utility
is the discounted sum of all future utilities, or:

(9)  $U = \sum_{t=0}^{\infty} \frac{u_t}{(1+r)^t}$


r is a positive rate of time preference; it need not be con- 
stant. 
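
A small numerical sketch shows why the whole path must be valued rather than each period separately. The code assumes a finite horizon and a constant rate r (the text allows r to vary), and the two utility paths are invented for illustration.

```python
def discounted_utility(path, r):
    """Sum of per-period utilities u_t discounted at a constant rate r."""
    return sum(u / (1.0 + r) ** t for t, u in enumerate(path))


def rank_policies(paths, r):
    """Policy names ordered from highest to lowest discounted utility."""
    return sorted(paths, key=lambda name: discounted_utility(paths[name], r),
                  reverse=True)


# Hypothetical per-period utilities: one policy acts quickly and fades,
# the other builds up slowly.
paths = {
    "fast_response": [1.10, 1.04, 1.00, 1.00],
    "slow_response": [1.00, 1.02, 1.06, 1.08],
}

print(rank_policies(paths, r=0.0))
print(rank_policies(paths, r=0.5))
```

With no discounting the slowly building path ranks first; with a high rate of time preference the ordering reverses, since early gains then carry more weight.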
However, this approach cannot be applied directly to the
determination of the utility of a cyclical path which is not
characterized by equi-proportional growth. The lack of such
growth over short periods makes it necessary to include a var- 
iance-covariance matrix of the arguments in the multiperiod 
utility function. This is required because, for example, the
consumer would probably prefer an equal interperiod rate of 
change in the supply of bread to one that fluctuated wildly or 
one that was not matched by an equal availability of butter. 
Thus, the CES function might be modified to: 


m n 1/6 
fap ea Ceeaion 8 += xi]. 


Ty 


t=1 (1+r)* 
where n is the length of the period being analyzed. This is a 
complicated expression. To simplify the analysis, in many in- 
stances merely accounting for the disutility of the variances 
might suffice. This could be done by using as elements in the 
utility function the reciprocal of the variances of the $x_i$ arguments
($1/v_i$), that is, utility increases as the variances decrease.
In the Cobb-Douglas case, for instance:

(11)  $u_v = \prod_i (1/v_i)^{\beta_i}$

The $\beta$'s here need not necessarily be the same values (implying
relative weights) as those on the corresponding $x_i$'s. Yet,
for the sake of simplicity and in the absence of any other in- 
formation or a priori estimates, this seems to be a reasonable 
assumption. 

Just as in behavioral models of individuals' expected value- 
risk aversion decisions, there are tradeoffs between the levels 
of a set of outcomes and their variances [6]. These tradeoffs 
might take the form of the usual convex indifference surface, 
or other complex, nonlinear functions. Thus, there are many 
possibilities for defining the total utility of a cyclical 
path as a combination of outcome and variance utilities. In 
the Fromm-Taubman simulations and utility calculations, which 
were only illustrative of techniques that might be used, total 
utility was taken as the sum of the two components, that is:

(12)  $u_T = a_1 u_x + a_2 u_v$

where the a's are relative weights.


A final step is required before undertaking utility computa- 
tions. Scaling of the arguments is necessary so that zero de- 
viations between policy and control solution results do not pro- 
duce zero utility. This can be accomplished by calculating 
utilities based on the ratio of the variables rather than their 
deviations. Similarly, for variance utilities, a modified coefficient
of variation can be used as arguments so that zero variances
yield maximum utility and infinite variances, zero utility.
Such a coefficient is:

(13)  $c'_{x_i} = \frac{s_i + \bar{x}_i}{\bar{x}_i}$

where $c'_{x_i}$ is the modified coefficient of variation, $s_i$ is the
standard deviation of an argument $x_i$ about its mean,
and the mean of $x_i$ is $\bar{x}_i$. When $s_i = 0$, $c'_{x_i} = 1$; when
$s_i = \infty$, $c'_{x_i} = \infty$.
Therefore, the arguments in the variance utility function are
the reciprocals of the $c'_{x_i}$. For example, the Cobb-Douglas variance
utility function reads:

(14)  $u_v = \prod_i (1/c'_{x_i})^{\beta_i}$
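
The scaling just described can be illustrated with a short Python sketch. It assumes the modified coefficient of variation is (s + mean)/mean, uses the population standard deviation (the text does not say which estimator was used), and combines the two components with hypothetical equal weights.

```python
import math
from statistics import mean, pstdev


def modified_cv(series):
    """Modified coefficient of variation: (s + mean) / mean, 1 when s = 0."""
    m = mean(series)
    return (pstdev(series) + m) / m


def cobb_douglas_variance_utility(paths, beta):
    """Product over arguments of (1 / c_i) ** beta_i."""
    return math.prod((1.0 / modified_cv(p)) ** b for p, b in zip(paths, beta))


def total_utility(u_level, u_variance, a1=0.5, a2=0.5):
    """Weighted sum of outcome utility and variance utility."""
    return a1 * u_level + a2 * u_variance


# Hypothetical time paths of two arguments: one steady, one fluctuating.
steady = [2.0, 2.0, 2.0, 2.0]
cyclical = [1.0, 3.0, 1.0, 3.0]

print(modified_cv(steady), modified_cv(cyclical))
print(cobb_douglas_variance_utility([steady, cyclical], [0.5, 0.5]))
```

A perfectly steady path has a coefficient of one and so contributes the maximum variance utility of one; any fluctuation pushes the coefficient above one and the utility below it.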

Using these concepts, the utility rankings of fourteen expansionary
fiscal and monetary policy actions were calculated.
The arguments in the functions were: real personal consumption
expenditures; real gross private domestic investment; real
government expenditures; the reciprocal of the rate of unemployment;
the current dollar government surplus; and the reciprocal
of the implicit price deflator for GNP. Various sets of weights
and functional forms were employed. Rankings were also calcu- 
lated based on traditional multiplier concepts such as the addi- 
tional dollars of real GNP generated per dollar of actual or 
equivalent real resource inputs. (Equivalent inputs are derived 
by translating changes in exogenous monetary parameters or variables
into shifts in the constant terms of real expenditure
functions [3].) All rankings were computed on a discounted
basis to account for the time shape of response patterns. An 
illustrative set of results is shown in Table 1. 

The details of these simulations need not concern us here. 
The important point is that the rank ordering of the tradition- 
al measure, multipliers, differs significantly from that ob- 
tained from using a reasonable a priori utility function. The 
rankings of different utility functions, using the same weights 
and arguments, also differ significantly, depending upon the 
elasticity of substitution (that is, the degree of substituta- 
bility between the gain in one argument for that of another). 

In other words, the ranking of policies depends critically 
on how their effects are viewed. If total output (real GNP) is 
the criterion, one ranking results; if the utility of a detailed 
set of impacts is the criterion, other rankings are obtained. 
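
A two-policy toy case makes the point concrete. The outcome figures and weights below are invented for illustration and are not taken from the simulations; only the two functional forms come from the text.

```python
import math


def linear_utility(x, beta):
    """Weighted sum: arguments substitute for one another freely."""
    return sum(b * xi for b, xi in zip(beta, x))


def cobb_douglas_utility(x, beta):
    """Weighted geometric form: unbalanced outcomes are penalized."""
    return math.prod(xi ** b for b, xi in zip(beta, x))


# Hypothetical outcomes on two arguments for two policies.
policies = {"A": [4.0, 1.0], "B": [2.4, 2.4]}
beta = [0.5, 0.5]

rank_linear = sorted(policies, reverse=True,
                     key=lambda p: linear_utility(policies[p], beta))
rank_cobb_douglas = sorted(policies, reverse=True,
                           key=lambda p: cobb_douglas_utility(policies[p], beta))

print(rank_linear, rank_cobb_douglas)
```

Policy A wins under the linear form because its large gain on one argument offsets its weak showing on the other; under the Cobb-Douglas form, where substitution is more limited, the balanced policy B ranks first.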

At this time, I would not advocate the rigorous application 
of utility functions for the formulation and evaluation of 
policies. Nevertheless, employing them in limited fashion, 
especially when a range of arguments and weights are used, is 
helpful in acquiring perspective on the relative desirability 
of alternative policies. 

Unfortunately, little work has been done to ascertain the 
form, variables, or weights of such functions. (Charles C. 
Holt in a 1968 proposal to the National Science Foundation for 
research support of a study entitled "Quantitative Estimation 
of Economic Policy Objectives" lists several books and articles 
of related interest.) While it may never be possible to do so 
definitively, interviews and simulation experiments could pro- 
vide insights into which variables are most important and the 
likely range of weights and degree of substitution. One tech- 
nique that can readily be implemented is to use econometric 
models in computerized simulation games asking individuals to 
choose amounts and mixes of instruments to realize what they 
conceive as national economic objectives. For example, initial
conditions of a recession or an inflation might be given; then,
how an individual stimulated or suppressed demand -- wage, price,
employment, expenditure mixes -- would be indicative of his
preference function. This approach is akin to the studies of
consumers' expenditure patterns in order to determine their implicit
preference structures. (For a survey of this work, see
Arthur S. Goldberger, "Functional Form and Utility: A Review
of Consumer Demand Theory," Systems Formulation, Methodology,
and Policy Workshop Paper 6703, Social Systems Research Institute,
University of Wisconsin (mimeographed), October, 1967.)

Table 1. Comparison of multiplier vs. utility rank ordering
(ranked after ten quarters) [3]

Columns: rank of discounted real input multiplier for real GNP;
rank of discounted total utility with Cobb-Douglas function.
[The rank entries are not legible in the scanned page.]

Policy
    Increase in government expenditure
        Durables
        Nondurables
        Employment
        Construction
    Income tax cut
    Income tax cut plus monetary policy
    Reserve requirements reduction
    Open market operations
    Excise tax cut (small)
        100 per cent pass along
        80 per cent pass along
        50 per cent pass along
    Excise tax cut (large)
        100 per cent pass along
        80 per cent pass along
        50 per cent pass along

For some reason, there are many economists who scoff at the 
idea of ever being able to discover and empirically verify 
social utility functions. They should be reminded that such 
functions are implicit in all government policy decisions. 
Blindly ignoring utility tradeoffs can only lead to failure to 
attain optimum welfare. And if John Dewey is correct that: 
"Every great advance in science has issued from a new audacity 
of imagination," then the attempt to explore utility space
can only lead to progress and improved economic decision-making.


Bibliography 


1. Douglas, Paul H. Twentieth Anniversary of the Employment 
Act of 1946: An Economic Symposium. U.S. Joint Economic 
Committee, February, 1 : 


2. Duesenberry, J.S., Fromm, Gary, Klein, L.R., and Kuh, E. 
The Brookings Quarterly Econometric Model of the United 
States. Chicago: North-Holland-Rand McNally, 1965. 


3. Fromm, Gary, and Taubman, Paul. Policy Simulations with an 
Econometric Model. Washington, D.C.: The Brookings Insti- 
tution Lo Gish 

4. Klein, L., and Goldberger, A.S. An Econometric Model of the 


United States, 1929-1952. Amsterdam: North-Holland Pub- 
GIS iit al COP MLO)SIS ie 


5. Klein, L.R., and Popkin, J. "An Econometric Analysis of the
Post-War Relationship Between Inventory Fluctuations and
Changes in Aggregate Economic Activity," Inventory Fluctuation
and Economic Stability, pt. III, Joint Economic Committee
(87th Cong., 2nd sess.) Washington, D.C.: Government
Printing Office, 1961, 69-89.


368 Gary Fromm 





6. Markowitz, Harry. "Portfolio Selection," Journal of
Finance, (March, 1952), 77-91.

7. Ramsey, F.P. "A Mathematical Theory of Saving," Economic
Journal, (December, 1928), 543-559.


8. Theil, H. Optimal Decision Rules for Government and Indus- 
try. Chicago: Rand McNally & Co., 1964. 


Non-Linear Econometric Models


Michael K. Evans, University of Pennsylvania 


Background 


Macroeconometrics has recently come of age. In the past
few years, econometricians have estimated and solved several
realistic medium and large scale macro models for a wide variety
of countries. Many of these models have demonstrated an
ability to track the economy accurately both during historical 
periods and for true ex ante forecasts, and have generated real- 
istic multiplier calculations for a large number of policy var- 
iables. Some of these results are already well known, and it 
is not my intention to review them here. Instead, I will focus 
on the problems raised by the nonlinearities inherent in these
models: the methods of solution and the implications for policy
analysis. Since the focus of this conference is on computer
simulation, I think it is appropriate to include a short review 
section on the current methods of computer solution for non- 
linear models. The main thrust of this paper, however, will 
be an exploration of the differences in multiplier values at 
various stages of economic activity. In particular, this paper 
examines the effects of changes in government spending and tax 
rates both at levels of full capacity utilization and consider- 
able economic slack. 

Although virtually all econometric models currently being 
used contain some nonlinearities, this aspect of multiplier 
analysis has received very scant attention. The only model 
for which different multiplier estimates are given at differ- 
ent levels of economic activity is one for the Dutch economy 
[12]. However, these results are somewhat suspect because of 


the negative multipliers for government expenditures. One 




ambitious project would be to take a representative sampling of 
all recent models and calculate the multipliers at high and low 
levels of economic activity for each of them. This has never 
been attempted, although there have been several tabulations of 
the functions of these models and the models themselves. For a 
comparison of individual consumption and investment functions, 
see [9, Ch. 9]. A tabular comparison is given in [10]. A brief
multiplier comparison is given in [1]; this analysis is extended 
considerably in [8]. In view of the extreme difficulties of 
exact reproduction of the data and computer programs needed to 
perform all these multiplier calculations, the comparison in 
this paper is limited to models of the United States [6], France 
[2], and Israel [3], each constructed entirely or in part by the 
author. While this is obviously not a comprehensive list, it 
does represent a rather broad cross-section of the types of 
countries for which it is feasible to construct macroeconometric 
models. Inasmuch as the models were all constructed by the same 
author, there will be some similarities, and even some monotony, 
among the various comparisons. On the other hand, since the 
models do contain these basic similarities, differences in mul- 
tipliers which are found are much more likely to represent true 
differences than those arising merely because different authors 
have different philosophies of model specification. 

We now briefly review some of the methods of non-linear solu- 
tion which have been used lately. Until quite recently, the 
Wharton-EFU model, among others, was solved by Newton's method. 
To adapt this to the particular problem at hand, consider the 


general system of nonlinear equations 
$$F_i(y_{1t}, \ldots, y_{nt};\; x_{1t}, \ldots, x_{mt}) = 0, \qquad i = 1, 2, \ldots, n$$


which can be approximated linearly by 





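$$F_i \approx F_i\bigl(y_{1t}^{(0)}, \ldots, y_{nt}^{(0)};\; x_{1t}, \ldots, x_{mt}\bigr) + \sum_{j=1}^{n} \left.\frac{\partial F_i}{\partial y_{jt}}\right|_{y = y^{(0)},\; x = x_t} \bigl(y_{jt} - y_{jt}^{(0)}\bigr)$$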




Solving this equation system for the vector of estimates of the
$y_{it}$ (call it $\hat{y}_t$), we have


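$$\hat{y}_t = \left[\left.\frac{\partial F_i}{\partial y_{jt}}\right|_{y = y^{(0)},\; x = x_t}\right]^{-1} \left( \sum_{j=1}^{n} \left.\frac{\partial F_i}{\partial y_{jt}}\right|_{y = y^{(0)},\; x = x_t} y_{jt}^{(0)} - F_i\bigl(y_{1t}^{(0)}, \ldots, y_{nt}^{(0)};\; x_{1t}, \ldots, x_{mt}\bigr) \right)$$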


The values of the $y_{it}^{(0)}$ are just guesses, which might
be previous values of the $y_{it}$'s. The degree of approximation
can be improved if we continue to solve this system by iteration.
In particular, we can think of the vector $\hat{y}_t$ obtained
above as just the first round of estimates, and denote them as
$y_t^{(1)}$. If we then use these estimates to solve the equation
system a second time, we have
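$$y_t^{(2)} = \left[\left.\frac{\partial F_i}{\partial y_{jt}}\right|_{y = y^{(1)},\; x = x_t}\right]^{-1} \left( \sum_{j=1}^{n} \left.\frac{\partial F_i}{\partial y_{jt}}\right|_{y = y^{(1)},\; x = x_t} y_{jt}^{(1)} - F_i\bigl(y_{1t}^{(1)}, \ldots, y_{nt}^{(1)};\; x_{1t}, \ldots, x_{mt}\bigr) \right)$$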
and, continuing this process, 
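$$y_t^{(r)} = \left[\left.\frac{\partial F_i}{\partial y_{jt}}\right|_{y = y^{(r-1)},\; x = x_t}\right]^{-1} \left( \sum_{j=1}^{n} \left.\frac{\partial F_i}{\partial y_{jt}}\right|_{y = y^{(r-1)},\; x = x_t} y_{jt}^{(r-1)} - F_i\bigl(y_{1t}^{(r-1)}, \ldots, y_{nt}^{(r-1)};\; x_{1t}, \ldots, x_{mt}\bigr) \right)$$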


until we reach a preassigned degree of precision for each $y_{it}$,
e.g.,

$$\left| \frac{y_{it}^{(r)} - y_{it}^{(r-1)}}{y_{it}^{(r-1)}} \right| < 0.001, \qquad i = 1, 2, \ldots, n$$
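The iteration just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two-equation system, its coefficients, and the function names `residuals`, `jacobian`, and `newton_solve` are hypothetical and are not taken from any of the models discussed in this paper. The stopping rule is the 0.001 relative-change criterion given above.

```python
# Toy two-equation system written as F_i(y; x) = 0 (hypothetical coefficients):
#   F_1 = y1 * (0.5 + 0.1 * y2) - x     a "quantity" relation with a p*q term
#   F_2 = y2 - 1.0 - 0.02 * y1          a linear "price" relation


def residuals(y, x):
    """Evaluate F_1 and F_2 at the current guess y for exogenous x."""
    y1, y2 = y
    return [y1 * (0.5 + 0.1 * y2) - x, y2 - 1.0 - 0.02 * y1]


def jacobian(y):
    """Matrix of partial derivatives dF_i/dy_j at the current guess."""
    y1, y2 = y
    return [[0.5 + 0.1 * y2, 0.1 * y1],
            [-0.02, 1.0]]


def newton_solve(x, y0, tol=0.001, max_iter=50):
    """Re-solve the linearized system until each variable changes by < tol (relative)."""
    y = list(y0)
    for r in range(1, max_iter + 1):
        f1, f2 = residuals(y, x)
        (a, b), (c, d) = jacobian(y)
        det = a * d - b * c
        # Solve J * step = -F by Cramer's rule (2 x 2 case only).
        step = [(-f1 * d + b * f2) / det, (c * f1 - a * f2) / det]
        new_y = [y[0] + step[0], y[1] + step[1]]
        converged = all(abs(n - o) < tol * abs(o) for n, o in zip(new_y, y))
        y = new_y
        if converged:
            return y, r
    raise RuntimeError("no convergence within max_iter rounds")


solution, rounds = newton_solve(x=10.0, y0=[10.0, 1.0])
print(solution, rounds)
```

For a model with 50 or 100 equations the two-by-two Cramer's rule step would be replaced by a general linear solver, which is where the computational burden of this method arises.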


This method will give an exact answer if the non-linearities
of the model are all in simple multiplicative form. To illustrate
this point, consider a model which can be separated into
a "price" (including wage) and "quantity" block, i.e., all
equations can be usefully classified as belonging in one block 
or the other. If nonlinearities arise only in the form pq or 
p/q, then the derivatives of each of the terms are linear ex- 
pressions in one of the two blocks. Since there is no approx- 
imation involved at this stage of the calculation, the degree 
of precision will be determined only by the convergence criterion.
Since the elements of the matrix of partial derivatives with
respect to prices will all be linear functions of quantities, and
analogously for the other block, we are able to solve the system of
equations exactly.

The convenient feature of exactness disappears, however, if 
the model contains certain other kinds of nonlinearities, such 
as those found in logarithmic production functions. In that 
case, for example, it is necessary to replace $\log y_{it}$ by

$$\log y_{it}^{(0)} + \frac{1}{y_{it}^{(0)}} \bigl(y_{it} - y_{it}^{(0)}\bigr)$$

Similar problems could arise if some of the equations contain 
nonlinear terms in capacity utilization or unemployment, such 
as might appear in price or wage functions. The buildup of 
error due to these additional approximations could be substant- 
ial in multi-sector models or where long simulations are cal- 
culated. In practice, moreover, this method has proved cumber- 
some to program and relatively slow for computational purposes, 
since it involves inversion of 50 x 50 or even 100 x 100 matrices.
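The size of the approximation error introduced by linearizing a logarithm can be illustrated numerically. The sketch below assumes a first-order Taylor expansion around the starting guess; the function name `linearized_log` and the values chosen are hypothetical.

```python
import math


def linearized_log(y, y0):
    """First-order Taylor approximation of log(y) around the guess y0."""
    return math.log(y0) + (y - y0) / y0


y0 = 100.0
for y in (101.0, 105.0, 120.0):
    exact = math.log(y)
    approx = linearized_log(y, y0)
    # The error grows roughly with the square of the distance from the guess.
    print(f"y={y:6.1f}  exact={exact:.5f}  approx={approx:.5f}  error={approx - exact:.5f}")
```

The error is negligible when the guess is within about one percent of the solution and becomes noticeable when the guess is far off, which is why a poor starting point or a long simulation can let such errors accumulate.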

Because of these drawbacks of the Newton method, most econ- 
ometric models are currently solved by the Gauss-Seidel method. 
While at least the first part of the name would indicate that 
this is not a particularly new method either, it has only quite 
recently been adapted to solving econometric models.* In a very 
broad sense, this method is an extension of Newton's method and
the division of the model into blocks. However, in the Gauss- 
Seidel method, each equation is treated as a separate block. 
This omits the need to solve multi-equation systems and thus 


to take partial derivatives. Each equation, no matter how 







complex, can be solved exactly as long as it can be expressed 
algebraically. Starting estimates are guessed (as in the New- 
ton method) and these estimates are used as values for the de- 
pendent variables on the right-hand side of each equation. 
When the first iteration has been completed, a new set of all 
simultaneous variables is available. These values are then 
used to re-solve the system one equation at a time. This pro- 
cess again continues until a preassigned degree of precision 
is reached. Algebraically, this method of solution can be ex- 
pressed as follows: 

Let the jth equation be expressed by 


$$y_{jt} = f_j(y_{1t}, \ldots, y_{nt};\; x_{1t}, \ldots, x_{mt}), \qquad j = 1, \ldots, n$$

We first evaluate 


$$y_{jt}^{(1)} = f_j\bigl(y_{1t}^{(0)}, \ldots, y_{nt}^{(0)};\; x_{1t}, \ldots, x_{mt}\bigr)$$


and continue to iterate until we obtain 


$$y_{jt}^{(r)} = f_j\bigl(y_{1t}^{(r-1)}, \ldots, y_{nt}^{(r-1)};\; x_{1t}, \ldots, x_{mt}\bigr), \qquad j = 1, \ldots, n$$

such that

$$\left| \frac{y_{jt}^{(r)} - y_{jt}^{(r-1)}}{y_{jt}^{(r-1)}} \right| < 0.001$$
It would be possible to substitute values already calculated
in the rth iteration for $y_{1t}, \ldots, y_{j-1,t}$ when solving
for $y_{jt}$. While this variant reduces the length of time needed
for computation, it may lead to a divergent solution for certain
orderings of the equations. In practice the Gauss-Seidel method has
been found to be approximately 10 times as fast as the Newton method
for those models which have been solved using both methods. It is
the method used for calculating all results reported in this
paper.
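The equation-by-equation scheme can be sketched as follows. This is a minimal illustration, not the program used for the results in this paper: the two toy equations, their coefficients, and the name `solve_equation_by_equation` are assumptions made for the example. The flag `use_latest` switches between the basic scheme (every right-hand side uses the previous round's values) and the variant that substitutes values already calculated in the current round.

```python
def solve_equation_by_equation(equations, y0, x, tol=0.001, max_iter=200,
                               use_latest=False):
    """Solve y_j = f_j(y; x) one equation at a time until each y_j settles."""
    previous = list(y0)
    for r in range(1, max_iter + 1):
        current = list(previous)
        for j, f_j in enumerate(equations):
            # Basic scheme: right-hand sides use round r-1 values only.
            # Variant: equations already solved in round r supply new values.
            inputs = current if use_latest else previous
            current[j] = f_j(inputs, x)
        converged = all(abs(c - p) < tol * abs(p)
                        for c, p in zip(current, previous))
        previous = current
        if converged:
            return current, r
    raise RuntimeError("no convergence within max_iter rounds")


# Hypothetical toy model, each equation normalized on its own variable:
#   y1 = x / (0.5 + 0.1 * y2)    "quantity" equation
#   y2 = 1.0 + 0.02 * y1         "price" equation
toy_model = [
    lambda y, x: x / (0.5 + 0.1 * y[1]),
    lambda y, x: 1.0 + 0.02 * y[0],
]

basic, n_basic = solve_equation_by_equation(toy_model, [10.0, 1.0], 10.0)
latest, n_latest = solve_equation_by_equation(toy_model, [10.0, 1.0], 10.0,
                                              use_latest=True)
print(basic, n_basic)
print(latest, n_latest)
```

No derivatives or matrix inversions are needed, only repeated evaluation of each equation. Whether the rounds converge depends on the model and, for the variant, on the ordering of the equations, so a failure to converge in this sketch raises an error rather than returning a result.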




Nonlinear Multiplier Analysis 


The widespread acceptance of the Hicksian IS-LM diagram and 
further advances in the elementary theoretical macroeconomic 
model, coupled with the Phillips curve analysis, have led us 
to expect smaller values of the multiplier near full employment 
for the following four reasons: 


(1) As the level of economic activity increases, idle cash 
balances are reduced and the interest rate rises at an 
increasing rate, thus reducing investment. 


(2) As prices rise, income is redistributed away from wage- 
earners toward profit recipients, who have a lower mar- 
ginal propensity to spend, and toward the government, 
which has a zero marginal propensity to spend. 


(3) As prices rise, the real value of money and government
bonds falls, so that the consumption/income ratio
declines.


(4) As prices rise, the net foreign balance declines as 
imports increase and exports decrease. 


All of these reasons, with the possible exception of the first, 
depend on prices rising at an increasing rate near full capac- 
ity. This in turn may come about either because wages rise at 
an increasing rate or because margins between prices and vari- 
able costs increase, or both. While there is strong evidence 
that the usual nonlinear Phillips curve occurs in all three 
countries examined in this study, the price responses are not 
quite the same, which does make some difference. 

The first reason listed above, although often thought to be 
the most important, has a very minor role in all of the models 
considered here. Part of the reason is the complete lack of 
any empirical evidence for a liquidity trap in the postwar period.
However, that argument alone does not preclude nonlinearities
in the LM curve near full capacity. It is still true that an
expansionary fiscal policy which is coupled with an unchanged
money stock may result in little or no change in economic ac- 
tivity if there were previously no idle cash balances. However, 
these expansionary fiscal developments are almost always fin- 
anced with the acquiescence if not the help of the Treasury, 
which attempts to keep monetary stringency at the same level 







it was previously. The monetary crunch of late 1966 is some- 
times cited as an example of independent action on the part of 
the Fed. However, in this case the Fed acted to dampen infla- 
tion only when it became clear that the tax surcharge preferred 
by the President would not be forthcoming soon. Only if prices 
start to rise more rapidly, thus driving up velocity at an in- 
creasing rate, will there be a decrease in the multiplier oper- 
ating through the interest rate. In the U.S. model, interest 
rates are controlled primarily by exogenous policy instruments: 
the discount rate and the free reserve ratio. It is assumed 
that these remain unchanged in the multiplier analysis which 
follows. There is no explicit monetary sector for the French 
and Israeli models. Velocity does enter into the investment
functions explicitly, but its effect is not large.

The effect of prices on income distribution is primarily 
responsible for the differences in multipliers for these models. 
In the U.S. the story is a familiar one; when prices rise, 
corporate saving and taxes rise as a percentage of GNP, thus 
reducing personal disposable income and consumption. The same 
general result is reversed, however, for the French and Israeli 
models. This requires a word of explanation. 

In both the U.S. and French cases, there is evidence that 
the spread between prices and unit labor costs increases near 
full employment, besides the increase in unit labor costs them- 
selves. However, there is no such evidence that this occurs in 
the Israeli economy. Thus wages and prices increase by the same 
percent, and wageearners may not be worse off. Furthermore, if 
no devaluation occurs in the short run (which is assumed to be 
the case in these simulations), imported goods will become 
cheaper, so that the real wages of workers may rise for a while. 
In addition, the price increase of services provided by the 
government may rise slowly or may lag substantially. In Israel 
this includes most rent, medical, legal, educational, utilities 
and transportation services in addition to the usual services 


provided by government. Furthermore, the Israeli tax structure 


is based heavily on excise taxes and import duties and as such 
is slightly regressive: a 1% increase in money income will 
result in about a 0.95% increase in tax receipts. When all 
these facts are combined, an increase in economic activity 
actually raises the value of the multiplier in real terms. 
Wages rise faster than the overall consumer price index, so 
that real personal disposable income and consumption increase 
more as full employment is approached. We might consider this 
as a "cost-plus" case of the IS-LM diagram which results in an
upward sloping IS curve. As GNP rises, real income is redis- 
tributed toward wageearners, which thus raises the C/GNP ratio. 
This in turn would clearly result in a larger multiplier, as 


shown in Figure 1. 
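The role of the slightly regressive tax structure in this argument can be made concrete with a little arithmetic. The sketch below assumes a constant-elasticity tax schedule with the 0.95 elasticity quoted above; the base income of 100, the base tax of 30, and the function name `tax_receipts` are hypothetical values chosen only for illustration.

```python
def tax_receipts(income, base_income=100.0, base_tax=30.0, elasticity=0.95):
    """Constant-elasticity schedule: a 1% rise in income raises taxes by about 0.95%."""
    return base_tax * (income / base_income) ** elasticity


for income in (100.0, 110.0, 120.0):
    tax = tax_receipts(income)
    disposable = income - tax
    # With an elasticity below one, the tax share of income falls as income rises.
    print(f"income={income:6.1f}  tax={tax:6.2f}  "
          f"tax share={tax / income:.4f}  disposable={disposable:6.2f}")
```

Because tax receipts grow more slowly than income, disposable income grows slightly faster than income itself, which works in the same direction as the other factors listed above.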


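[Figure 1. IS-LM diagrams for the demand-pull case and the cost-push case; each panel shows the LM curve and a shift of the IS curve from IS to IS', with X on the horizontal axis.]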


The other two effects are not too important in the models 
analyzed here. There is some evidence of a small Pigou effect 
in France, where inflation has been particularly severe, but 
none in the other countries. The net foreign balance price
elasticities are often substantial, so that the last effect has
some influence. However, it is not enough to override the
importance of the effects of income distribution.

Our analysis so far has pertained exclusively to the effect 


of changes in fiscal policies on output. Effects on prices, 







net foreign balance, unemployment, and the rate of growth (as 
measured by changes in investment) are also quite important. 
These facets of economic activity are also catalogued with the 
more familiar multiplier calculations. As would be expected 
from what has already been said about the Phillips curve, 
changes in prices are likely to be quite different for differ- 
ent levels of capacity and unemployment. These multiplier cal- 
culations are tabulated after a brief outline of the properties 
of the models. 


Capsule Overview of the Models 


It would be neither desirable nor worthwhile to list all of 
the equations in all the models, particularly since they have 
appeared elsewhere. What follows here is just enough to give 
a general flavor of each model. 

The version of the Wharton-EFU model used here is the latest 
in a continuing line of quarterly models which have been used 
to make genuine ex ante forecasts on a continuing basis since
1963. This latest version has been reestimated through 1967.4
and incorporates the data revisions of July 1968. This model
contains 51 stochastic equations and is essentially a two-sector
model. Sectoral disaggregation is undertaken for fixed
business investment, inventory investment, depreciation,
production functions, wages, prices, and hours, for the
manufacturing and non-manufacturing sectors. The government and farm
sectors are each treated separately but are totally exogenous. 
The most detail is found in the investment sector, which con- 
tains six functions. Imports are disaggregated into food, raw 
materials and semimanufactures, and all others, but exports 
are estimated in a single equation. The price deflator for 
the manufacturing sector is a function of unit labor cost and 
capacity utilization; the deflator for the non-manufacturing 
sector is determined as a residual. Prices for the components 


of aggregate demand are also a function of unit labor cost and 




linear and non-linear capacity utilization. These equations 
are estimated only from 1954 to the present because of the 
existence of price controls during the Korean War. A modest- 
size monetary sector of six equations explains the components 
of the money supply and interest rate on short-term commercial 
paper, long term bonds, and time deposits. 

The French model is an annual model, estimated from 1952 
to 1965. Almost all variables are formulated in percentage 
rates of change, since the model was designed primarily for 
short term forecasting use at O.E.C.D. This model is somewhat
smaller in size than the Wharton-EFU model, and contains only 
34 stochastic equations. The principal differences are that 
it is only a one-sector model, which eliminates seven equa- 
tions, and has no monetary sector, which eliminates six more. 
The foreign sector is considerably larger, containing three 
import and four export equations. Prices depend primarily on 
unit labor cost and import prices. Only in the capital goods 
sector is capacity utilization important, and there it appears 
in a nonlinear form. 

The Israeli model is the largest of the three with 79 sto- 
chastic equations, and is a seven-sector model (agriculture, 
manufacturing, construction, transportation and communications, 
trade and services, housing, and government). While the first 
and last sectors contain some elements of exogeneity, they do 
include endogenous production, price, and wage functions. The 
export and import sectors are greatly expanded; there are a 
dozen or more equations for each. Both goods and services 
(although not capital movements) are estimated separately. 
This model is also estimated annually from 1952 to 1965. Vari- 
ables are expressed in levels instead of first differences, 
except for the price and wage equations, where the percentage 
change formulation is used. Further salient characteristics 


for all three models are given in Table 1. 


Table 1. Tabular Comparison of the Models

[Table body not legible in this copy. Column headings: Sector; Model (US, F, I); No. of Equations; Type of Equations; Lag Structure; Principal Dependent Variables; Other Comments. Sectors covered: Consumption; Fixed Business Investment; Other Investment; Imports; Exports; Production Function; Wages; Other Supply (including orders); Sector Prices; Sector Output; Factor Shares; Taxes and Transfers; Product Prices; Monetary Sector.]







Empirical Estimates of the Multipliers 


The latest revised version of the Wharton model was solved 
for a 10-quarter period at approximately 4% unemployment and 
93% capacity utilization, representing a situation near full 
employment, and at approximately 8% unemployment and 82% cap- 
acity utilization, representing a situation with substantial 
underutilization of resources. The French model was solved
for an 8-year period at unemployment levels of approximately 1%
and 4%, respectively. While these rates seem quite low relative to
U.S. experience, the unemployment rate in France during the
postwar period has frequently approached 1% and has never been
even as high as 3%. The Israeli model was also solved for an
8-year period at unemployment levels of approximately 4% and 8%,
respectively.
In the postwar period, the Israeli unemployment rate has been 
as low as 3.3%, and as high as 12% in the early years of the 
State. It briefly reached 10% in the last quarter of 1966, but 
recovered sufficiently so that the annual figures for 1966 and 
1967 are both less than 8%. Thus in all cases the postwar lim- 
its are fairly well covered, and should serve as upper and 
lower limits to multiplier values which will be encountered in 
practice. 

There are a great many different multiplier calculations 
which can be obtained by using different policy variables. How- 
ever, in order to keep the results in this paper to manageable 
size, we have restricted ourselves to the examination of the two
most widely used multipliers, namely a change in government
expenditures and a change in the personal income tax rates. 

In all cases the effects on all economic variables due to a 
change in the tax rates have been normalized by the increase in 
the tax base due to rising income over the simulation period. 

We could have calculated multipliers for a change in the money 
stock, but these were omitted because of the small monetary
sector in the Wharton model and no monetary sectors in the other 
models. In all cases, the multipliers are calculated for an 


increase in economic activity. See Tables 2-7. 







throughout the rest of this paper. The multiplier calculations
are given in Tables 2-7.
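The way such multipliers are obtained from a nonlinear model can be sketched with a deliberately small example: solve the model once for a baseline, solve it again with government spending raised, and divide the change in output by the change in spending. The one-equation toy model below, its coefficients, and the names `price_level`, `solve_output`, and `multiplier` are all hypothetical; it is not any of the three models analyzed here and is meant only to show why the answer depends on the starting level of activity.

```python
CAPACITY = 100.0  # hypothetical full-capacity output
MPC = 0.6         # hypothetical marginal propensity to consume


def price_level(output):
    """Prices are flat with slack and rise at an increasing rate above 85% utilization."""
    excess = max(0.0, output / CAPACITY - 0.85)
    return 1.0 + 5.0 * excess ** 2


def solve_output(government, tol=1e-10, max_iter=10_000):
    """Solve X = MPC * X / P(X) + G by repeated substitution."""
    x = government
    for _ in range(max_iter):
        new_x = MPC * x / price_level(x) + government
        if abs(new_x - x) < tol:
            return new_x
        x = new_x
    raise RuntimeError("no convergence within max_iter rounds")


def multiplier(government, shock=1.0):
    """Change in solved output per unit change in government spending."""
    return (solve_output(government + shock) - solve_output(government)) / shock


slack = multiplier(30.0)          # baseline well below capacity
near_capacity = multiplier(38.0)  # baseline close to capacity
print(f"multiplier with slack:         {slack:.3f}")
print(f"multiplier near full capacity: {near_capacity:.3f}")
```

In this toy case rising prices erode real consumption near capacity, so the multiplier is smaller there. A model with a different price and income-distribution mechanism could produce the opposite ranking.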


Table 2 


Changes in selected variables for a $1 billion 
increase in government purchases*: Wharton Model 


Quarter: 1 2 3 4 5 6 7 8 9 10
Variable


C 0.49 0.64 0.64 0.63 0.62 0.60 0.58 0.58 0.58 0.57
0.49 0.64 0.64 0.60 0.57 0.58 0.59 0.60 0.61 0.63 

I OFOC Os ie Dest Osre 0-59. (0550. 0.28. 0.24) 0.21 30.48 
0200" 0546" 0251 "0545" 0°55 0.30 0-31 0.351 (0.351 0.28 

EX -0.01 -0.03 -0.05 -0.06 -0.08 -0.09 -0.10 -0.11 -0.12 -0.13 
-0.01 -0.03 -0.05 -0.05 -0.05 -0.06 -0.06 -0.06 -0.06 -0.06 

IM 0.04 0.07 0.08 0.08 0.08 0.08 0.09 0.09 0.09 0.09 
0.04 0.06 0.07 0.08 0.08 0.08 0.08 0.09 0.09 0.09 

DI SPSS eee) pelt o 151s) 4 D7 1.030598. 0794. 10.89: 90283 
0°84, 1.708 1.08, 1.00 0°95 0.92 0.94 (0.95 0.95 0.94 

X Tra eel 2 Ol 90 leis" ACTS £69" Aso2, 1057 1552 
PARA ct e0Le 2602) 1.90" 178 1275 P76 Ae. Ae TT 

P 0.02 0.03 0.05 0.06 0.08 0.09 0.10 0.12 0.15 0.15 
0.02 0.03 0.04 0.04 0.04 0.04 0.04 0.04 0.05 0.05 

Un -0.18 -0.21 -0.20 -0.18 -0.16 -0.15 -0.12 -0.10 -0.10 -0.11 


C219, 0) 265-0..27 -0.25 -025-0.21, -0 20-0518 -05 17 -O516 


*Government purchases of goods and services were increased by $1 billion in 
1958 dollars. Government employment was raised by 0.1 million, and output 
originating in the government sector was increased by $0.5 billion. The 
top rows of numbers are the figures at 4% unemployment; the bottom rows 
are for 8% unemployment. 


where: 
C = Total consumption expenditures, billions of 1958 dollars 
I = Gross private domestic investment, billions of 1958 dollars 
EX = Exports of goods and services, billions of 1958 dollars 
IM = Imports of goods and services, billions of 1958 dollars 
DI = Personal disposable income, billions of 1958 dollars 
X = Gross national product, billions of 1958 dollars 
P = Implicit GNP deflator, 1958 = 100.0 
Un = Rate of unemployment, percent 


Table 3 


Changes in selected variables for a decrease in 
personal income tax rates*: Wharton Model 


Quarter: 1 2 3 4 5 6 7 8 9 10
Variable


C 0.62 0572 0.74 0.77 0.79 0.82) 0.83 0 SSO yaenOnes 
0.59 0.68 (0.69 0.70 O.72 (01.74 (07/9! SOM BiS ORG memos 

I O.13 0-45 0.45 0.40 0.56 O55 0054.5 OS Zico mmml arc 
0.03 0.417 0.43 0.36 0.33 (0,34 0.365 (0; CS On oommolnao 

EX -0.01 -0.03 -0.05 -0.07 -0.08 -0.09 -0.11 -0.12 -0.14 -0.16 
-0.01 -0.03 -0.05 -0.05 -0.05 -0.06 -0.07 -0.07 -0.07 -0.08 

IM 0.106 0.08 0.09 0.00 O.00 O20) ON 25 OS Cremona 
0.06 0.08 0.08 0.08 0.109 0.1/0 Of D1 (OND 25 ORS etic 

DI 1538 1.49 2.52 2.50) 1.50 1.49) 1490 | Alomar 
1.29 135. 1237 1.35 1.356. 1.358) 1, 43s Verano ue 

X 0.69) 1.05 1.05) 0.99 0.96 0.95) (0.93) sO; OS OR SO RGmcie 
0.64 0.99) 0.98 0.92 0189 0.92 0.987) Te 0 SiO ieee 

P 0.02 0,02! 0)..04 0.05 0/07 J0'.08) 0!509) VO 2 GR ism eeeo 
0.02 (0.01! 0.02 0,02 0:10 0/02) (0025 (00 Z2OnUSemeno- 

Un -0.03 -0.07 -0.07 -0.06 -0.05 -0.04 -0.03 -0.03 -0.03 -0.03 


-0.03 -0.08 -0.08 -0.06 -0.06 -0.05 -0.05 -0.05 -0.05 -0.05 


*The slope of the personal income tax rate was decreased by the amount cor- 
responding to a decrease in personal income taxes of $1.0 billion in 1958 
dollars for 1968.3 levels of income. 

All variables are the same as in Table 2. All figures are normalized for 
the increase in the tax base. 


Table 4 


Changes in selected variables for a 1 billion franc
increase in government purchases*: French Model 





Year: 1 2 3 4 5 6 7 8 
Variable 
c ORAS) 1051 VOSS) e0)eSOm 0K S2em Oly syn Oye Ome rere 
nti RA owe) ry WarA (oily yal) abs 
I =02.0/2) T0525 (0206) (OS) s0e10) 1002 OO Leonor 
=05 03" (0/3212 -=0'5 016) 10). 17” (ORi07/ NOR0Z8=00103" 006 
EX -0'.33 =0).29-0.23 -0.26.-01.24 -0.22 —0.24 =0)724 
=0,.35 -0.27 =0.22 -0..26-01..22 —0)22))-0). 22 0leas 
IM OS2Z75 10465) O32 NONSA TOS 2a Orc. OrenO orc Ome Ofer 
eA SWBO MOGI WS) Wor Up ve 740) On dke 
DI O64 ON iZ O47. OA GOA 7) a0 SSO) 4-4 Oe) 
On66! VOR44 VOR 27 (0.42 032) ORZOF SOR ZS aORal 
x 5st SO WR Wet OES KOSS Was) Wav 
OSS" OSI O.7 5) O89) 08a) VOR Ge 0 7a ORG 
P O49) MON SI 075) 06S (ORGS a OnOOn mOl9 men Omor 
O49) 10-56 (O48 055) 0175.0) OS On Soe Olio, 
Un -0.87 -1.18 -1.10 -1.08 -0.95 -0.81 -0.74 -0.66 


= ay 25) —0)1917 1205 101915) OS oi Olea ae O lair 


*Government consumption is increased by 1 billion francs in 1958 
francs. Output originating in the government sector increases 
by 0.6 billion and government employment increases by 0.6 mil- 
lion. The top rows of numbers now represent the figures at 
1% unemployment; the bottom rows are for 4% unemployment.

All variables are now in billions of 1958 francs; the unemploy- 
ment figures are now in tens of thousands of workers. 


Table 5 
Changes in selected variables for a decrease in 


personal income tax rates*: French Model 
Year: 1 2 3 4 5 6 7 8 
Variable 
Cc O56) 10-70) 0-164) 075.) 078), 40/85) | .0)..910) 01.98 
Oe 55) 06S) Oe 5S 0/159 0)164) 10/68) 10) 75) 1079 
I “5072 0225 Wo0s Wnl4 Wail) Meath) Waa Waitt! 
=502 Wea <WEOi OCOS M508 Ws07 “WeOuy Way 
EX -0.14 -0.11 -0.04 -0.08 -0.07 -0.06 -0.06 -0.06 
-0.14 -0.14 -0.03 -0.07 -0.07 -0.07 -0.07 -0.07 
IM OAotG W988 Os29 O25 Case Wo4k0 Wats Oahu 
Hels Oss Wes WSAS) WSO) WL52° Woevt Maks 
DI ep Sees Seles Ome les SO la 7/ Feel Sez 109 2529 
ese Smee TON Ne SG) M50) et 64. 9 a'-7.9) 97 
x Ole24 ee O44 Oh Sl O4)S) 0) AGO) 4 7 Ok Se 0ly5,5 
0.28 OWoc WoO Woe? OWsSS Wee Wes) Wcd'v 
P OO) W522 Wei! Ws0E Meno Wetwe) Westy Wcals 
Ooi Osly W909 MoWE OCH Oo WoW Waly 
Un -0.21 -0.44 -0.34 -0.39 -0.37 -0.33 -0.32 -0.29 


=e =O 5o) —O52 =O55) —0l55 0654-054) — 0/55 


*The slope of the personal income tax rate was decreased by the 
amount corresponding to a decrease in personal income taxes of 
1 billion francs in 1958 francs for 1965 levels of income. 
All variables are the same as in Table 4. All figures are nor- 
malized for the increase in the tax base. 


Table 6 


Changes in selected variables for a 1 million IL 
increase in government purchases*: Israeli Model 


Year: 1 2 3 4 5 6 7 8 
Variable 

C WOSe  AeS ie e200 250) S08) Sit Soe Aral 5 10)5 
ROSS hepsi Ole el SO elt, OA mele pom select Gre ite SOc Oln 

I Oe SOMOS Se Oca OlrcOM Oh Ss O02) eOL0/ie 1006 
Wea “Ws ils Wik Only Woy Osws; Wool =onW7 

EX -0.09 -0.21 -0.40 -0'.67 -1.05 -1.52 -2.08 -2.85 
-0.09 -0.22 -0.34 -0.50 -0.68 -0.88 -1.08 -1.31 

IM O09) 0:05) 10).02, —0F0'8) = 0).7272, 07495 0167, 0/5918 
0209 ©60! 0/2" 1000) —0F 06-0. 45-0223 -0)34 -0746 

DI NHS PZrer LEE® ZENO i Se Stenle?, were Collen Ay.) S10 caer OIS ane 5307,0) 
Gwe ALY Slo Gah Weal AS SAO Pen), 7G al 

X DA Zia DS)) ASD Selle eS tS) we Oly > Oly S105 Aive> 
ol ey Basis Wo sul “esl eae Aarne PS) 

P ORO 4 ORS O28 O44 ONOZ (OS AON 26 
Dae eM Woil Omit On Waa ie Weg) 

Un 0208S. 097) — 12 22 ee Sy lS Sy Siz 


=a S067 SOs! Wavy “Warts Seo SWsOil SW5 s7/ 


*Non-defense purchases of the government increased by IL 1 mil- 
lion (government employment and output change endogenously). 
The top rows of numbers again represent 4% unemployment; the 
bottom rows are for 8% unemployment. 

All figures are now in millions of 1964 IL; the price deflator 
is based 1964 = 1000.0, and the rate of unemployment is the 
percent multiplied by 100.0. 


Table 7
Changes in selected variables for a decrease in 
personal income tax rates*: Israeli Model 
Year: 1 2 3 4 5 6 7 8 
Variable 
ic 44S 2),04) 2)./515) Sie) Sta 474 02a 2 
WaS4 WG 1.183: Wa92 97) 9:2) CO sO cmmeercs 
I Oe Was Wa Wars Way Waal Wawro). ie 
O82 0.48 0.36 0528 (020 (0)m8 sOkee7 Ors 
EX -0.12 -0.30 -0.53 -0.82 -1.18 -1.64 -2.07 -2.64 
-0.12 -0.30 -0.45 -0.60 -0.76 -0.92 -1.07 -1.23 
IM 0.23 0.15 0.07 -0.05 -0.20 -0.40 -0.64 -0.89 
0.21 0.09 0.04 -0.04 -0.13 -0.23 -0.30 -0.41 
DI 240) 92.816.) 3.36 05-84 04), 33) 04). 051 0 Smee ome) 
225) 92.29) 25135) .2'..38) 22/40) 92:4) sehr eee 
X 97) 82), 22) AN 02) (64) 82). (8)2) eS 100218 ONO nl 
1584) 2.7/4. 70) 65. 155 Salsa eee 
P a We acl We WAG Week Woes: eile 
eV Cos WRI OAIS Ont Weil Walks O15 
Un =n 23-159. -1.48))-1 51) =. 5) 70) 142) elem 


-1.14 -1.08 -1.04 -0.94 -0.84 -0.75 -0.69 -0.62 
*The slope of the personal income tax rate was decreased by the 
amount corresponding to a decrease in personal income taxes of 
1 million IL in 1964 pounds for 1965 levels of income. 
All variables are the same as in Table 6. All figures are nor- 
malized for the increase in the tax base. 

Several salient features of the multipliers merit brief
comment; we consider the Wharton model first. Both the
government expenditure and tax rate multipliers are some-
what lower than those reported elsewhere for the U.S. economy
[4,5,7] and are also somewhat smaller than those calculated 
for earlier versions of the Wharton model. This latter dif-
ference occurs because in this version there is substantially
more price increase per unit change in output. Since prices 
rise substantially, the net foreign balance decreases more, 
which reduces the multiplier somewhat. The change in housing 
is also negative, as price effects outweigh income effects. 
As expected, the tax change multipliers are smaller than the 
government expenditure multipliers: the ratio is about 0.6 
after 10 quarters. This ratio increases slightly over time as 
the mpc approaches the apc in the long run. 

Both of the multipliers are somewhat larger for the 8% un-
employment rate, which is the expected result. The difference
does not become noticeable until the eighth quarter, however. 
Before that the 4% multipliers are slightly higher, although 
the differences are quite small. The main difference arises from
higher personal disposable income at the 4% level, which is due
in turn to a less severe cutback in manhours near levels of
full employment than during the recession. If this difference
were adjusted, the multiplier differentials would be even 
larger. 

The largest differences between the two solutions become appar-
ent on examining the movements of the price deflators. Where-
as at the 4% unemployment level a 1% increase in real GNP is
accompanied by a 0.7% increase in prices, at 8% unemployment 
prices rise by only 0.2%. This has a substantial effect on 
exports, which decrease twice as much at 4% unemployment as at 
8%. For each 1% increase in GNP, the net foreign balance de-
clines 2% at high levels of activity compared to 1% at low
levels.

The multipliers for the French model are somewhat surprising 
because of their very low values. The French economy is a 
very open one with respect to foreign trade. The income elas- 
ticity of imports is almost 3.0, and of exports is approximately 
-1.0 (higher capacity utilization lowers exports) which reduce 
the multiplier sharply. In addition the price elasticities for
foreign trade are substantial - only [illegible] for imports but 1.3
for exports - which also lowers the multiplier values. The
marginal propensity to consume is estimated to be approximately 
0.6 after 8 years, which is lower than figures obtained for 
other countries. Investment decreases over time after reach- 
ing a peak after one year both because of the stock-adjustment 
formulation of the equation and because a rise in prices raises 
the velocity of money, which reduces investment. 

Comparing the multipliers at 1% and 4% unemployment, it
becomes clear from Tables 4 and 5 that the multipliers at full
employment are larger than those at slack capacity. Income is
distributed in favor of wage earners and away from the rela- 
tively large government sector. The figures for personal dis- 
posable income, especially for the change in government expend- 
itures, show this very strongly. Disposable income in constant 
prices rises almost twice as much at 1% as at 4% unemployment. 
Wages rise quickly (there is a shorter lag structure here than 
in the other two models) but consumer prices do not fully re- 
spond, both because so many services continue to be sold at 
fixed prices set by the government and because import prices
do not rise. Thus wage earners are relatively better off, not
at the expense of pensioners, whose payments are closely re- 
lated to price increases, but at the expense of those who wish 
to see the franc as a stable international currency. There is 
also a slight decrease in fixed business investment, so that 
the real growth rate is somewhat retarded. 

The magnitude of these effects can be seen by comparing the 
results of a 1% change in output at high (1% unemployment) and 
low (4% unemployment) levels of economic activity. Near full 
employment, each 1% incremental change in GNP due to an increase 
in government spending will raise prices by 3% and will decrease 
the net foreign balance by over 4%. Even at 4% unemployment, 
prices will rise by almost 2% and the net foreign balance will 
decline by 2 1/2%. The increments will not be as great for 
the change in tax rates, both because the multipliers are smal- 
ler, and because unemployment decreases proportionately less so 
that the economy is not pushed so far along the Phillips curve. 
Even so, the increase in prices and the deterioration in the net
foreign balance are still considerable.

The multipliers for the Israeli model at 8% unemployment are 
rather similar to, although slightly larger than, those for the 
Wharton model. However, those at 4% are much larger. First, 
it might seem surprising that a small country such as Israel, 
which seems dependent on world trade for economic viability, 
would have such large multipliers. However, Israeli imports
are very income-inelastic and are related instead to export
requirements. Raw materials, food, fuel (until 1967), and 
export-related services (such as shipping, insurance and sim- 
ilar expenses) comprise almost all of total imports; other 
consumer goods are less than 5% of the total. Furthermore, 
many of these goods are restricted by extremely high tariffs 
(up to 400%) or by quotas, so that the realized income elas- 
ticity is not very high. It can be seen in Tables 6 and 7 that 
imports actually decrease when output rises. This does not 
imply a negative income elasticity per se but rather reflects 
the fact that, when exports decrease, imports of raw materials 
for manufactured goods also decrease. The very large decreases 
in exports are due both to a high price elasticity for goods 
(1.1, excluding diamonds) and a very sensitive substitution 
between export and domestic production depending on the purchas- 
ing power of the domestic market. 

Second, the much larger multipliers near full employment 
merit some further comment. The increase is completely due to 
the additional rise in disposable income and hence in consump- 
tion; investment increases very slightly (housing increases, 
but fixed business investment decreases) and the net foreign 
balance declines. The difference in price increases is strik-
ing; prices increase five times as much near full employment by
the end of the 8-year period, at which time they are rising 3%
for each additional 1% increase in GNP. While this increase is
somewhat less than the rate of inflation for the French economy 
near full employment, it is still substantial enough to cause 
government-provided services and imported goods to become rela- 
tively much less expensive, thus raising real income. Further- 
more, there is no evidence that the spread between price and 
variable costs widens near full employment in Israel, unlike
the other economies studied in this paper. Thus real income of
wage earners does not decrease for this reason. Virtually all
types of fixed income are linked either to the dollar or to the
cost-of-living index. The net result of all these effects is
that the incremental change in personal disposable income after
eight years is more than twice as great at 4% as it is at 8%
unemployment, and there are similar changes in consumption. 

The net foreign balance deteriorates considerably near full 
employment: every 1% increase in GNP decreases it by 1 1/2% 
(relative to an average of imports and exports) which eventually 
leads to devaluation. This of course increases the price of 
imported goods, which decreases real personal disposable income 
and consumption. The effects of devaluation are not considered 
in our simulations. 

Except for the 4% unemployment multipliers for the Israeli 
economy, all the multipliers reach their peaks in the first or 
second period and then gradually decline, although there may 
be some oscillations later. Such results run counter to elemen- 
tary multiplier analysis, which suggests that the multiplier 
rises over time because the mpc does. However, this effect is 
always offset by stock adjustments in other sectors of the model, 
including fixed business investment, inventory investment, hous- 
ing (in the Israeli model) and consumer durables. There is no 
a priori reason why these effects will not counterbalance the 
increases in the mpc for nondurables and services. This appears 
to be the case for all multiplier calculations except the full-
employment Israeli case, where disposable income continues to
increase very rapidly.
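
The peak-then-decline pattern described above can be illustrated with a deliberately small sketch. The model below is a hypothetical toy, not any of the three econometric models discussed in this paper: consumption responds to last period's output, and investment follows a stock-adjustment rule with no depreciation. All parameter values (`mpc`, `capital_output`, `adjust`) are assumptions chosen only for illustration.

```python
def multiplier_path(periods=12, mpc=0.5, capital_output=1.0, adjust=0.3):
    """Deviation of output from baseline after a sustained unit rise in
    government spending.

    Hypothetical toy model (an assumption, not one of the paper's models):
      consumption:  C[t] = mpc * Y[t-1]
      investment:   I[t] = adjust * (capital_output * Y[t-1] - K[t-1])
      capital:      K[t] = K[t-1] + I[t]
      output:       Y[t] = 1 + C[t] + I[t]
    """
    path = []
    y_prev, k_prev = 0.0, 0.0
    for _ in range(periods):
        consumption = mpc * y_prev
        investment = adjust * (capital_output * y_prev - k_prev)
        y = 1.0 + consumption + investment
        k_prev += investment          # capital stock catches up over time
        y_prev = y
        path.append(y)
    return path


path = multiplier_path()
peak_period = path.index(max(path))
print("multiplier by period:", [round(v, 2) for v in path])
print("peak reached in period index:", peak_period)
```

In this sketch the consumption response keeps building, but once the capital stock approaches its desired level the investment response fades and turns negative, so the multiplier peaks early and then falls back.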


Conclusion 


In this paper, I have reported on the results of simulations 
of three different econometric models at various levels of econ- 
omic activity. One purpose of this study was to demonstrate 
that traditional multiplier analysis cannot be applied to com- 
plex nonlinear econometric models, and the overly simplified 
linear approximations of an earlier day are also invalid. In 
particular, the effects of fiscal policies on output, employment, 
prices, net foreign balance and other dimensions of economic 
activity are quite different depending on the particular posi- 
tion of the economy relative to full capacity and employment. 


These differences are due to many interactions in the
model which may act in opposing directions. Thus the results
may be quite complex and may not agree with those expected from
simplified prototype explanations. In order to assess all these
diverse movements in proper perspective, it is necessary to
rely on the detailed logical framework of a nonlinear econome-
tric model. It is hoped that the results in this paper provide
further support for this position.


Bibliography 


1. Evans, M.K. "Multiplier Analysis of a Postwar Quarterly 
U.S. Model and a Comparison with Several Other Models," 
Review of Economic Studies, (October, 1966). 


2. Evans, M.K. "A Short-Term Forecasting Model of the French
Economy," [remainder of entry illegible].

3. Evans, M.K. "An Econometric Model of Part of the Israeli
Economy," Econometrica, (October, 1970).
vania. 


4. Evans, M.K. Macroeconomic Activity: Theory, Forecasting,
and Control. New York: Harper & Row Publishers, Inc., 1969.


5. Evans, M.K. "Reconstruction and Estimation of the Bal-
anced Budget Multiplier," Review of Economics and Stat-
istics, (February, 1969).


6. Evans, M.K., and Klein, L.R. The Wharton Econometric 
Forecasting Model. 2nd enlarged edition. Philadelphia: 
University of Pennsylvania, 1968. 


7. Fromm, Gary, and Taubman, Paul. Policy Simulation with an
Econometric Model. Washington, D.C.: The Brookings Insti-
tution, 1968.


8. Hickman, Bert. "Dynamic Properties of Macroeconometric
Models: An International Comparison," prepared for SSRC 
Conference, "Is the Business Cycle Obsolete?" 

9. Klein, L.R. The Keynesian Revolution. 2nd ed. New York:
Macmillan Co., 1966.

10. Nerlove, Marc. "A Tabular Survey of Macro-Econometric
Models," International Economic Review, (May, 1966).


11. Norman, Morris. "Solving a Non-Linear Econometric Model," 
paper delivered at the December 1967 Econometric Society 
Meetings. 


12. Verdoorn, P.J., and Post, J.J. "Capacity and Short-Term Mul-
tipliers," Econometric Analysis for Economic Planning.
Edited by P.E. Hart, G. Mills, and J.K. Whitaker. Pro-
ceedings of the Sixteenth Symposium of the Colston Re-
search Society.


Footnote 


* This method was introduced to economics by Lawrence
Erdman in his M.A. thesis at M.I.T. For a further discussion
of this method see [11].


Complex Organizational 
Processes 


Martin Pfaff, American University 


Introduction 


Problems of Validity and Inference in the Simulation of Com- 
plex Hierarchically-Organized Systems 


Most simulations of real-world systems that involve elements 
of control within the simulated processes can be characterized 
as hierarchically-organized systems: Processes of control gen- 
erally are superimposed on the technological processes, say, 
of a factory, industry or economy. Moreover, many of the sim- 
ulations of large-scale business or industrial processes -- 
whether formulated for the purpose of research or training -- 
are characterized by some variant of a hierarchically-organized 
system with multiple linkages and a host of feedback relation- 
ships. These relationships are generally introduced by the 
model builder who is concerned with the validity of overly simp- 
lified models in the context of the real-world complexity. And, 
indeed, it may be held that such an approach does reduce the
problem of validity to a more tolerable level. Yet, in thus
solving one major problem, another problem -- that of inference 
-- is increased: The trade-off between "validity and inference" 
-- or, in other terminology, that between realism and formalism 
-- implies the need for a strategic decision on the part of the 
model builder in the design phase of a simulation experiment. 

This paper reports on a method developed for those cases 
where the model builder has opted in favor of "realism," at a 
cost of increasing the problem of inference to limits not cov- 
ered by traditional inferential procedures. Examples of such 
strategic design choices are found in complex all-computer sim-
ulations involving many system-levels, random time lags and/or
interactions, together with a generally high level of "noise" 
introduced into the simulated system by exogenous or endogenous 
random elements; or, in the case where the model builder elects 
to represent one aspect of the system by human actors, as he
is not able to represent their behavior adequately within the
computer model. Game-simulations (or man-computer simulations) 
of the latter type are employed mostly for purposes of training, 
and occasionally for purposes of research; both types involve 
problems of measurement and inference, in either evaluating the 
success of the training vehicle or in testing some hypotheses 
about the system. 

It will be recognized that traditional optimization proce- 
dures -- such as linear or dynamic programming -- are not ap- 
plicable to the analysis or solution of such complex systems. 
However, the experimenter's insight into the structure of the 
hierarchically-organized system can be utilized, together with 
a set of inferential heuristics which make possible (1) the "de- 
composition" of the hierarchically-organized system into a set 
of components whose behavior is inferred separately (employing, 
in turn, a set of sequential analysis procedures), and (2) the 
re-assembly of components into the overall system, that is, the 
inference of overall system-behavior from the inferred behavior 
of the components. These heuristics and their associated compu-
tational procedures will be described below.

This approach to the study of complex systems carries the 
rationale for simulation -- that is, the representation of com- 
plex systems even when analytical methods are cumbersome if 
not outright inapplicable -- one step further. In so doing, 
it decreases the problems associated with validity, while sug- 
gesting heuristics for processes of inference for which algor-
ithms have not been developed so far.


Complex Organizational Processes: Definitions and Assumptions 


A univariate organizational process $X_t$ is defined as a
string of random variables, that is, for each value of t (equals
time), $X_t$ is a random variable. A multivariate organizational
process, in turn, is defined as a set of strings of random var-
iables which pertain to an interrelated hierarchical system.
If $X_t$ is a discrete organizational process, an organizational
pattern of size m occurs if an ordered arrangement of values
recurs after m time periods. The values of this realization of
the process may or may not, in fact, depend on the value of t,
depending on whether $X_t$ is or is not stationary.

For purposes of the analysis methods that will be described, 
it was assumed that each variable depends on one or more other 
variable(s), including time (t), and a stochastic term (u) 
which


(a) is independent of time,

(b) has mean zero,

(c) is not autocorrelated, and
(d) has fixed variance.


In short,

(1) $E(u_t) = 0$,

(2) $E[u_t - E(u_t)]^2 = \sigma^2$, and

(3) $E\{[u_t - E(u_t)][u_{t+s} - E(u_{t+s})]\} = 0$, $s \neq 0$.


In real-world situations, often only a single realization of 
the process under study is available. The properties of the 
process therefore have to be estimated on the basis of a sample 
of one; this is possible if the process is stationary or can be 
transformed into a stationary process (e.g., by detrending of 
the series under investigation). 

A process is termed stationary in the wider sense if (a) the
first two moments of $X_t$ (that is, mean and variance) are con-
stant for all t; and (b) the covariance between $X_t$ and $X_{t+s}$
($s \neq 0$), that is, the autocovariance of $X_t$, is only dependent on s
and not on the value of t.
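
As a rough illustration of working from a single realization, the sketch below removes a linear trend and then compares the sample mean and variance of the two halves of the series, which is one informal check on condition (a). The function names, the half-sample comparison, and the synthetic series are assumptions made for this example; they are not procedures taken from the paper.

```python
from statistics import fmean


def autocovariance(series, lag):
    """Sample autocovariance of one realization at the given lag."""
    n = len(series)
    mean = fmean(series)
    return sum((series[t] - mean) * (series[t + lag] - mean)
               for t in range(n - lag)) / n


def detrend(series):
    """Remove a least-squares linear trend in t from the series."""
    n = len(series)
    t_mean = (n - 1) / 2
    x_mean = fmean(series)
    slope = (sum((t - t_mean) * (x - x_mean) for t, x in enumerate(series))
             / sum((t - t_mean) ** 2 for t in range(n)))
    return [x - x_mean - slope * (t - t_mean) for t, x in enumerate(series)]


def half_sample_moments(series):
    """(mean, variance) of each half; similar values are consistent with (a)."""
    half = len(series) // 2
    return [(fmean(part), autocovariance(part, 0))
            for part in (series[:half], series[half:])]


# Synthetic series (an assumption): linear trend plus a two-period oscillation.
trended = [0.5 * t + (1 if t % 2 else -1) for t in range(40)]
flat = detrend(trended)

print("before detrending:", half_sample_moments(trended))
print("after detrending: ", half_sample_moments(flat))
print("lag-1 autocovariance after detrending:", autocovariance(flat, 1))
```

The half-sample means differ widely before detrending and nearly coincide afterwards, which is the sense in which a trended series "can be transformed into a stationary process."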


The Problem of Inference Posed by Stochastic Processes Involv- 
ing Human Actors within the Computer Simulation 
Available statistical techniques for the design of simula-
tion experiments -- whether they are response surface designs,
sequential designs, or whether they are based on multi-variate
analysis -- may be termed "one level" analysis techniques:
They aid in inferring relationships between the input and out- 
put variable(s) of a given process. However, in the case of 
hierarchically-organized systems, the measurements taken on the 
output of one level -- say the "technological level" of an or-
ganization -- may be influenced by the output values of higher-
level control processes.

In Figure 1, a highly simplified representation of an ideal- 
ized hierarchically-organized system is depicted. This system 
may be seen as a representation of a firm: The technological 
system is concerned with the physical transformation of raw 
materials into final outputs, while the control system -- the 
middle-management level -- serves as a regulator, which in turn 
is controlled by the command system, or top management. If the 
relationships between these system-levels are of a deterministic 
type with built-in lags and nonlinearities, methods of analytical 
description and solution are generally available. Specifically,
calculus of variations, dynamic programming, and Pontryagin's
maximum principle come to mind. Even if some of the rela-
tionships are of a stochastic nature, optimal solutions may be
found.

Those problems of analysis that are of greatest practical
importance, however, are often characterized by limited
information, if not a total absence of information, on some of
the linkages between the subsystems that constitute the overall
system. Occasionally, only input values of a higher-level
process and output values of a lower-level process are known,
while it is understood that there must exist some intervening
linkages which have great importance for the system dynamics of
the lower-level or controlled sub-system. This type of "noisy"
system can therefore be approached more readily with the help
of simulation. Often these higher-level processes which pertain
to human behavior are so complex that the researcher attempts
to model them by employing human actors who interact with the
lower-level process.
Methods of game-simulation of hierarchically-organized systems
therefore represent a special class of complex organizational
simulations.


Level 1 - Command Level
Level 2 - Control Level
Level 3 - Technological Level

Figure 1. Hierarchically structured organization

As long as the means for interaction -- through communica- 
tions and control -- are highly structured and limited, only 
one special problem of inference (beyond those already de-
scribed) is posed: Individuals may or may not react to the set
of stimuli (information) to which they are exposed, depending 
upon their subjective valuation of such information. 

The most difficult case arises, however, when the in-
teraction patterns between the human actors (if not between
the human and non-human actors modeled within the computer) are
not closely controlled. A loss of information therefore occurs
which, together with the "noise" introduced into the system,
makes the problem of inference very difficult indeed.

Even if the computer-component of the system is expressed
in the form of a deterministic model, three types of stochastic
processes will be noted:


(1) from the subject's point of view, his environment will
be characterized by a stochastic process whose pattern he will
attempt to infer intuitively (or occasionally, analytically);
and,


(2) from the experimenter's point of view the subject's be-
havior will appear as a stochastic process, insofar as the
subject's choice of a subset of the total information set and
of a subset of the total control set from his potential cogni-
tive set will appear as nondeterministic; and,


(3) from the experimenter's point of view the behavior of 
the total organization, consisting, say, of a group of human 
actors and of a group of simulated actors ("robots") will ap- 
pear as a stochastic process, as long as he is not privy to 
all interactions taking place between the human actors, even 
if the relationships with the simulated actors are completely 
described. 


The nature of these three types of stochastic processes will 
be briefly explored. 


Stochastic process type 1. Even if it is assumed that a
human actor will behave rationally when placed into a complex
novel environment, it will be evident that his behavior cannot
be understood in a "deterministic" manner. The main reason 
for this phenomenon is simply that an individual does not have 
enough time to learn the interrelationship between all control 
and performance variables with which he is concerned. 

By viewing the process of selection as a random process -- 
which assumes the absence of any prior information on the part 
of the decision maker as to the impact of control on perfor- 
mance variables -- the magnitude of the information processing 
task can be ascertained for the case of n control variables 
and m performance variables. 

Step 1: In the most simplified view, a rational decision
maker may be expected to exercise one control variable ($c_1$)
picked at random; thereafter he may be expected to observe the
impact of this control action on one performance variable ($p_1$),
picked at random. Changes in the level of performance, in turn,
will have an impact on the level of control exercised by the
decision maker.

Step 2: The decision maker adds a second control variable
($c_2$) to the one he already exercised; after observing the sep-
arate effect of this variable ($c_2$), he studies the combined
effect ($c_1, c_2$) of these two variables.

Step 3 to step n: The decision maker proceeds analogously
to Step 2, by adding further and further variables, until the
nth -- or last -- variable has been included in the scheme.
This inclusion implies also the testing of all subsets or of
all $(2^n - 1)$ combinations of control variables.

Step n+2 to step n+m: The decision maker includes further
performance variables and observes the impact of the $(2^n - 1)$
combinations of control variables on the increasing set of
performance variables.

This approach would force the decision maker to look at
$(2^n - 1)m$ different subsets of variables provided he drops none
of the performance variables he has included in his observa-
tion. Evidently this number becomes rather unmanageable even
with a rather restricted set of controls and performances.
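
The growth of this count can be checked directly. The sketch below enumerates the non-empty subsets of a small set of control variables and evaluates $(2^n - 1)m$ for a few sizes; the variable names and the sizes tried are illustrative assumptions only.

```python
from itertools import combinations


def control_subsets(controls):
    """Yield every non-empty subset of the control variables."""
    for size in range(1, len(controls) + 1):
        yield from combinations(controls, size)


def observation_count(n_controls, n_performances):
    """Number of (control subset, performance variable) pairs: (2**n - 1) * m."""
    return (2 ** n_controls - 1) * n_performances


controls = ["c1", "c2", "c3"]            # hypothetical control variables
subsets = list(control_subsets(controls))
print(len(subsets), "non-empty subsets of", controls)

for n, m in [(3, 2), (10, 5), (20, 10)]:
    print(f"n={n:2d}, m={m:2d}: {observation_count(n, m):,} cases to examine")
```

Even twenty controls and ten performance variables already give more than ten million cases, which is the sense in which the task becomes unmanageable for a human decision maker.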


Stochastic process type 2. As the individual subject learns
to select a subset of performance variables and other informa-
tion, his behavior will become increasingly stable; he will
tend to exercise certain control variables in response to cer-
tain variables only, and this linkage between performance-in-
formation and control action may involve lags of various dur-
ation.
An experimenter who peruses the record of control actions
taken, and of the information potentially available to the
subject, is still faced with the problem of inferring rela-
tionships as perceived by the subject. Only a stochastic pro-
cess model will be found adequate for this purpose in the course
of most complex organizational simulations.


Stochastic process type 3. Even if and when the experimenter
has built a stochastic model of the behavior of the individual
human actors, the behavior of the total organization -- that is,
of all human actors together with the robot-actors within the
computer model -- appears as stochastically-determined. The
sheer number of possible interactions between the various com-
ponents -- of which the observed patterns are only one reali-
zation of a large number of possible interaction-combinations
-- imposes the need of viewing total organizational processes
in a stochastic frame of reference.


"Decomposition and Reassembly" Heuristics 


Overall Research Procedure 


In the absence of an acceptable algorithm to tackle problems 
of such complexity, a set of research heuristics was formulated. 
In their most simple description, they may be termed "decompo-
sition and reassembly heuristics" as they involve two overall
research steps: 

(1) Decomposition implies (a) an arrangement of organiza- 
tional measures into subgroups that are conceptually distinct 
in terms of the simulated system (e.g., in a managerial simula- 
tion the distinction into control, information, communication, 
environmental, and performance measures suggests itself), and 
(b) an analytic separation of the total system into all major 
subsystems at which transformation of specified inputs into 
outputs takes place (in terms of the above example, the trans- 
formation of communication, control and information into per- 
formance, which lies at the individual-manager level); and, 

(2) Reassembly implies the use of inferential or intuitive 
methods to derive relationships between and across subsystems 
-- termed "organizational patterns" -- in so far as they in- 
volve empirical regularities of the type defined earlier. 

Within both phases a sequential analysis may be employed. 
Particularly in the decomposition phase (where along with anal- 
ytical questions, the problem of data reduction has often to be 
handled) a sequential analysis design will serve best. 

The initial step of decomposition involves an aggregation 
of data to the level desirable in terms of the accuracy re- 
quired, on the one hand, and the mechanical problems of gener- 
ating data sequences that can be handled in the statistical 
analyses, on the other. A compromise will frequently be re-
quired.


Since any analysis is concerned with the "causal" linkage
of events, the first task is to identify which events should be
of interest for the given analysis. Even though some of the
causal linkages may be unknown, it is valid to investigate the
effect of one or more events on another event though they may
not alone or not directly cause this other event. In terms of
this conceptual grouping certain subsystems emerge as the cru-
cial "agents or stations of transformation." It is this sub-
system level at which analysis should then be carried out.

For each subsystem its total of k organizational measures
(variables) should then be arranged into subgroups $k_1, k_2, k_3,
\ldots$ ($\sum_i k_i = k$), such that each $k_i$ corresponds to one set of
events (in terms of the above discussion) that belong concep-
tually together.

k may be chosen quite large provided it is known that quite 
a number of the k variables have no observations at all, be- 
cause they provided an option which was not exercised through- 
out the simulation. This will be desirable, because it may 
render it possible to use an identical set of k variables for 
all subsystems. 

A frequency count can be employed to reduce the abstract 
set k to a subset that reflects the actually important vari- 
ables. 
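
A frequency count of this kind might be sketched as follows. The record layout (one dictionary per period, with `None` marking an option that was not exercised), the variable names, and the `min_count` threshold are all assumptions made for the example rather than details given in the paper.

```python
from collections import Counter


def exercised_variables(records, min_count=1):
    """Count non-missing observations per variable and keep those that were
    observed at least `min_count` times.

    `records` is a list of dicts, one per period, mapping a variable name to
    its observed value (None or absent if the option was not exercised).
    """
    counts = Counter(name for record in records
                     for name, value in record.items() if value is not None)
    return {name: n for name, n in counts.items() if n >= min_count}


# Hypothetical log of one subsystem over three periods.
records = [
    {"price": 10.0, "advertising": None, "output": 120},
    {"price": 10.5, "advertising": None, "output": 118},
    {"price": 10.5, "advertising": 2.0, "output": 125},
]
kept = exercised_variables(records, min_count=2)
print(kept)
```

Because the same full list of k variable names can be used for every subsystem, the count itself shows which options were actually exercised in each one.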

This causal impact of one or more sets $k_i$ of variables on
the variables of another set, say $k_j$, may then be inferred by
a technique such as, for example, multiple linear regression;
or -- if a generation of best explanators is desired -- a step- 
wise regression. This procedure gives rise to an "average" 
type of relationship. If the series under analysis are a real- 
ization of an oscillatory process the fit of the regression 
cannot be expected to be very good. However, it can be used 
to select according to some decision rule (like number of ex- 
planators, etc.) the most important explanators of a variable. 
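
One way to generate "best explanators" under a decision rule on their number is greedy forward selection, sketched below. This is an assumption about what such a stepwise routine might look like, not the routine used in the study: at each step it enters the candidate that most reduces the residual sum of squares and stops after `max_explanators` entries. The function name and the synthetic data are hypothetical.

```python
import random


def forward_stepwise(columns, y, max_explanators):
    """Greedy forward selection by reduction in residual sum of squares.

    `columns` is a list of explanatory series, each as long as `y`.
    Returns the indices of the chosen columns in order of entry.
    """
    n = len(y)

    def centered(v):
        mean = sum(v) / n
        return [a - mean for a in v]

    def dot(a, b):
        return sum(p * q for p, q in zip(a, b))

    residual = centered(y)
    work = [centered(c) for c in columns]  # orthogonalized as variables enter
    chosen = []
    while len(chosen) < min(max_explanators, len(columns)):
        best, best_gain = None, 0.0
        for j, c in enumerate(work):
            norm = dot(c, c)
            if j in chosen or norm < 1e-12:
                continue
            gain = dot(c, residual) ** 2 / norm  # drop in RSS if j enters
            if gain > best_gain:
                best, best_gain = j, gain
        if best is None:
            break
        pivot = work[best]
        norm = dot(pivot, pivot)
        coef = dot(pivot, residual) / norm
        residual = [r - coef * p for r, p in zip(residual, pivot)]
        # Remove the entered variable from every remaining candidate.
        for j, c in enumerate(work):
            if j != best and j not in chosen:
                share = dot(c, pivot) / norm
                work[j] = [a - share * p for a, p in zip(c, pivot)]
        chosen.append(best)
    return chosen


# Synthetic example: y depends on columns 2 and 4 only.
rng = random.Random(0)
columns = [[rng.gauss(0, 1) for _ in range(200)] for _ in range(5)]
y = [3.0 * a - 2.0 * b + 0.1 * rng.gauss(0, 1)
     for a, b in zip(columns[2], columns[4])]
selected = forward_stepwise(columns, y, max_explanators=2)
print(selected)
```

The stopping rule here is the number of explanators, matching the decision rule mentioned in the text; a threshold on the gain would be an equally plausible alternative.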

Since there is evidence of and reason to believe that
organizational processes are oscillatory in nature, it is
desirable to describe the properties of the processes by means
of spectral analysis and to relate individual variables to
their best explanators one by one by means of cross-spectral 
analysis, or to all of them by partial cross-spectral analysis. 
These methods may provide a closer association, particularly 
in some frequency bands. 

The reassembly phase consists in the evaluation of the results
of these statistical analyses and in the establishment
of certain regularities across variables and subsystems.

The organizational patterns that may emerge may be manifold:

(1) The type of variable may differ or be the same from
subsystem to subsystem.

(2) The number of variables per subsystem may vary. 

(3) The same dependent variables may have the same or dif- 
ferent major explanators in different subsystems. 

(4) Within the same subsystem, certain explanators may be
more frequent than others, etc.


Correlation and Regression Analysis 


The techniques applied in turn in the analysis sequence
described above can be correlation, multiple regression,
spectral, and cross-spectral analysis.

Only a very short and superficial discussion of these suggested
techniques will be offered.


(a) Simple correlation coefficients provide a measure of
the closeness and direction of the linear association between
two variables. The value of the correlation coefficient
varies between 1 and -1. A correlation coefficient of 1
indicates a perfect direct linear association, one of 0 no
linear association, and one of -1 a perfect inverse linear
relation.


As an estimator of this standardized measure of association
the following formula is usually employed:

\[
r_{xy} = \frac{n\sum xy - \sum x \sum y}
{\sqrt{n\sum x^2 - (\sum x)^2}\,\sqrt{n\sum y^2 - (\sum y)^2}}
\]

In the analysis simple correlation coefficients may be used
for three purposes:


(1) Some general information about pairs of variables,
the direction and extent of their linear relationship can
be obtained;


(2) The correlation of a variable with time in an or- 
ganizational and behavioral context provides some indica- 
tion of the learning effect; and, 


(3) The simple correlation coefficient can be used to 
estimate the linear regression coefficients, if the nature 
of the input series does not permit the application of 
least squares estimators, due to singularity of the data 
matrix or gaps in data series. 
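
The product-moment estimator can be computed directly from the sums that appear in the formula. The sketch below is illustrative only; the function name `correlation` and the sample series (an error count recorded over six periods, used to show a correlation with time as in purpose (2)) are hypothetical.

```python
import math


def correlation(x, y):
    """Product-moment correlation computed from raw sums."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    numerator = n * sxy - sx * sy
    denominator = math.sqrt(n * sxx - sx ** 2) * math.sqrt(n * syy - sy ** 2)
    return numerator / denominator


# Hypothetical learning effect: errors per period fall as time passes.
periods = [1, 2, 3, 4, 5, 6]
errors = [9, 7, 6, 4, 4, 2]
r_time = correlation(periods, errors)
print(round(r_time, 3))
```

A strongly negative value of `r_time` would be read, in the sense of purpose (2), as an indication that errors decline over time.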


(b) Multiple linear regression provides one of the most 
widely used statistical prediction methods involving multi- 
variate processes. 


The underlying assumption is that a dependent variable Y
is linearly dependent on a set of k variables $X_1, \ldots, X_k$
and a serially uncorrelated stochastic term u with zero mean
and constant variance $\sigma^2$:

\[
Y = \sum_{i=1}^{k} \beta_i X_i + u, \qquad
E(u) = 0, \qquad \operatorname{var} u = \sigma^2, \qquad
\operatorname{cov}(u_t, u_{t+s}) = 0 \quad (s \neq 0)
\]


A widely employed method for estimation of the values of
$\beta_i$ is that of least squares. It involves a minimization of
the squared deviation of the estimated value $\hat{Y}$ of Y from Y.
This process, however, involves the inversion of the matrix
(X'X), the product of the transpose of the matrix of values
of $X_i$ with the matrix of values of $X_i$. $(X'X)^{-1}$ clearly
exists only if (X'X) is non-singular. Not all data satisfy this
requirement.
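
For reference, the least-squares step can be written compactly in matrix form. This is the standard textbook statement rather than a formula taken from the paper; the symbols $X$ (the $n \times k$ matrix of observations on the explanatory variables), $Y$ (the $n$-vector of observations on the dependent variable), and $\hat{\beta}$ are notation assumed here.

```latex
\[
\min_{\beta}\; (Y - X\beta)'(Y - X\beta)
\quad\Longrightarrow\quad
X'X\,\hat{\beta} = X'Y
\quad\Longrightarrow\quad
\hat{\beta} = (X'X)^{-1} X'Y .
\]
```

If, for example, two columns of $X$ are proportional to one another, $X'X$ has less than full rank and the last step cannot be carried out.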

In some simulation analyses, an added difficulty may be 
posed by incomplete series, i.e., by gaps in the data. A 
least squares routine could not handle these types of data 
matrices. Therefore an estimation routine has to be employed 
in such cases which can circumvent this problem. 

The Gauss-Seidel method, which has only the restriction that
the main diagonal of the input matrix has to have non-zeros
and the number of observations has to be equal to the number
of explanatory variables, can be employed in such cases. By
using the intercorrelation matrix of the k variables $X_i$ and Y
as input these requirements are easily fulfilled. The resulting
estimates are the standardized partial regression coefficients,
i.e., the regression coefficients $\gamma_j$ of the system

\[
r_{yi} = \sum_{j=1}^{k} \gamma_j r_{ij}, \qquad i = 1, \ldots, k
\]

where $r_{yi}$ are the estimates of the correlation coefficients
between Y and $X_i$ and $r_{ij}$ the estimates of the correlation
coefficients between $X_i$ and $X_j$. The $\gamma_i$ can be
transformed into estimates of the $\beta_i$ of the system
$Y = \sum \beta_i X_i$ by multiplying $\gamma_i$ by the standard
deviation of Y and dividing through by the standard deviation
of $X_i$, i.e.,

\[
\hat{\beta}_i = \gamma_i \frac{S_y}{S_i}
\]

where $S_y$ is the standard deviation of Y and $S_i$ the standard
deviation of $X_i$.
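
The two steps described above, solving the correlation system by Gauss-Seidel iteration and rescaling the result, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the stopping tolerance, and the two-variable correlation values are hypothetical, and the iteration is only guaranteed to converge when the correlation matrix is non-singular (hence positive definite).

```python
def gauss_seidel(R, r_y, sweeps=200, tol=1e-10):
    """Solve R * gamma = r_y by Gauss-Seidel iteration.

    R is the k x k matrix of correlations among the explanatory
    variables (unit diagonal); r_y holds their correlations with Y.
    """
    k = len(r_y)
    gamma = [0.0] * k
    for _ in range(sweeps):
        largest_change = 0.0
        for i in range(k):
            off_diagonal = sum(R[i][j] * gamma[j] for j in range(k) if j != i)
            new_value = (r_y[i] - off_diagonal) / R[i][i]
            largest_change = max(largest_change, abs(new_value - gamma[i]))
            gamma[i] = new_value  # updated value is used immediately
        if largest_change < tol:
            break
    return gamma


def unstandardize(gamma, s_y, s_x):
    """Rescale standardized coefficients: beta_i = gamma_i * s_y / s_i."""
    return [g * s_y / s for g, s in zip(gamma, s_x)]


# Hypothetical correlations for two explanatory variables.
R = [[1.0, 0.5], [0.5, 1.0]]
r_y = [0.8, 0.7]
gamma = gauss_seidel(R, r_y)
beta = unstandardize(gamma, s_y=2.0, s_x=[1.0, 4.0])
print(gamma, beta)
```

Because only correlations enter the iteration, each coefficient can be estimated from whatever pairs of observations are available, which is what makes the approach usable with gaps in the series.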


Spectral and Cross-Spectral Analysis [1] 


As is well known, an oscillatory process $X_t$ with regular
periodicity can be approximated, by Fourier's Theorem, by

\[
X_t = \sum_{i=1}^{\infty} A_i \cos(i\omega t + \theta_i)
\]

where $2\pi/\omega$ is the period of $X_t$, $\theta_i$ the phase
shift of the ith component, and $A_i$ the amplitude of the ith
component. Should the process $X_t$ be strictly non-periodic,
i.e., not deterministic but stochastic in nature, an extension of
Fourier's analysis is needed. This is provided by a technique
known as spectral analysis, which can be applied to
autocovariance-stationary processes, i.e., processes with the
properties of having no trend in mean, variance, and
autocovariance:

\[
\begin{aligned}
E(X_t) &= m \\
\operatorname{var}(X_t) &= E(X_t - m)^2 = \sigma^2 \\
\operatorname{cov}(X_t, X_{t+\tau}) &= E[(X_t - m)(X_{t+\tau} - m)] = \mu_\tau
\end{aligned}
\tag{1}
\]




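Processes with trends in mean or variance can be analyzed
by performing spectral analysis on a transformation of them.
As Cramér, Kolmogorov, Wiener and others have shown, a
complex covariance-stationary process can always be expressed
by

\[
X_t = \int_{-\pi}^{\pi} e^{it\omega}\, dz(\omega) \tag{2}
\]

where $z(\omega)$ is a complex process of non-correlated
increments, i.e.,

\[
E[(z(\omega_1) - z(\omega_2))(z(\omega_3) - z(\omega_4))^*] =
\begin{cases}
0, & \omega_1 > \omega_2 > \omega_3 > \omega_4 \\
F(\omega_1) - F(\omega_2), & \omega_1 = \omega_3,\ \omega_2 = \omega_4
\end{cases}
\tag{3}
\]

(* asterisks indicating complex conjugates), and consequently

\[
E[dz(\omega_1)\, dz(\omega_2)^*] =
\begin{cases}
0, & \omega_1 \neq \omega_2 \\
f(\omega)\, d\omega, & \omega_1 = \omega_2 = \omega
\end{cases}
\tag{4}
\]

$F(\omega)$ is the power spectral distribution function of the
process $X_t$, and by the stipulated property of $X_t$ being
non-periodic $dF(\omega) = f(\omega)\, d\omega$, where $f(\omega)$
is continuous on $-\pi \le \omega \le \pi$.
The autocovariance of such a process can be expressed by
its Fourier transform

\[
\mu_\tau = \int_{-\pi}^{\pi} e^{i\tau\omega} f(\omega)\, d\omega \tag{5}
\]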

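A real process $X_t$ can be expressed by

\[
X_t = \int_{0}^{\pi} \cos t\omega\; du(\omega) + \int_{0}^{\pi} \sin t\omega\; dv(\omega) \tag{6}
\]

where $u(\omega)$ and $v(\omega)$ are real functions and
$z(\omega) = \tfrac{1}{2}(u(\omega) + i v(\omega))$. Furthermore
for a real process $\mu_\tau = \mu_{-\tau}$ and consequently
$f(\omega) = f(-\omega)$; for a real process thus

\[
\mu_\tau = 2\int_{0}^{\pi} \cos \tau\omega\; f(\omega)\, d\omega \tag{7}
\]

\[
\int_{-\pi}^{\pi} f(\omega)\, d\omega = \operatorname{var}(X_t) \tag{8}
\]

and $f(\omega)\, d\omega$ is the contribution to the overall
variance of $X_t$ by the component of $X_t$ of the frequency band
$(\omega, \omega + d\omega)$.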




Since in many cases only one realization of $X_t$ is available
for estimation of $f(\omega)$ this would amount to estimating
$f(\omega)$ from a sample of one. The properties of stationarity,
however, allow this, since the relevant magnitudes (mean,
variance, and covariance) are constant over time. Since, however,
a realization of $X_t$ is, normally, of finite length, the problem
arises that only a finite number of lags $\tau$ can be taken and
$\mu_\tau$ estimated; since $f(\omega)$ has to be estimated from
the inverse of (5) or (7), respectively, only a finite number of
values of $f(\omega)$ can be estimated, each of which pertains to
a frequency band around the point of estimation.


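Inversion of (7) yields

\[
f(\omega) = \frac{1}{2\pi}\Bigl[\mu_0 + 2\sum_{\tau=1}^{\infty} \mu_\tau \cos \tau\omega\Bigr] \tag{9}
\]

Actual estimators of (9) are of the form

\[
\hat{f}(\omega_j) = \frac{1}{2\pi}\Bigl[\lambda_0 \hat{\mu}_0 + 2\sum_{\tau=1}^{m} \lambda_\tau \hat{\mu}_\tau \cos \frac{\pi j \tau}{m}\Bigr], \qquad 0 \le j \le m \tag{10}
\]

where $\hat{f}(\omega_j)$ is the estimator of $f(\omega)$ around
frequency $\omega_j = \pi j / m$; $\lambda_\tau$ is a filter which
varies between routines of estimation and is employed to obtain a
more consistent estimator; $\hat{\mu}_\tau$ is the estimator of
$\mu_\tau$; m is the maximum lag chosen.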
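
A lag-window estimator of this general kind can be sketched in a few lines. The text leaves the filter unspecified; the sketch below assumes the Tukey-Hanning window, 0.5(1 + cos(pi*tau/m)), as one common choice, and normalizes the cosine sum by 1/(2*pi) with the zero-lag term counted once. The function names and the test series are hypothetical.

```python
import math


def autocovariance(x, lag):
    """Biased sample autocovariance at the given lag."""
    n = len(x)
    mean = sum(x) / n
    return sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag)) / n


def spectrum(x, m):
    """Lag-window estimate of f at the frequencies pi*j/m, j = 0..m."""
    mu = [autocovariance(x, tau) for tau in range(m + 1)]
    # Assumed filter: Tukey-Hanning lag window.
    lam = [0.5 * (1.0 + math.cos(math.pi * tau / m)) for tau in range(m + 1)]
    estimates = []
    for j in range(m + 1):
        w = math.pi * j / m
        total = lam[0] * mu[0] + 2.0 * sum(
            lam[tau] * mu[tau] * math.cos(tau * w) for tau in range(1, m + 1)
        )
        estimates.append(total / (2.0 * math.pi))
    return estimates


# A cycle of period 8 (frequency pi/4) plus a weak second component.
x = [math.cos(math.pi * t / 4) + 0.1 * math.sin(1.7 * t) for t in range(400)]
f_hat = spectrum(x, m=16)
peak = max(range(len(f_hat)), key=lambda j: f_hat[j])
print(peak, math.pi * peak / 16)
```

The index `peak` marks the frequency band that contributes most to the variance of the series; a larger `m` narrows each band at the cost of a noisier estimate.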

A high value of $\hat{f}(\omega_j)$ implies a large contribution
by the frequency band around $\omega_j$ to the overall variance of
$X_t$ -- and a large value of $z(\omega_j)$, the amplitude. For
$X_t$ this implies a dominance of the component of frequency
$\omega_j$. Care has, however, to be exercised in the
interpretation of spectral peaks. For some spectral estimators
confidence bands are available. Another, more crude way of
affirming the validity of spectral peaks is to look for peaks in
the estimates of the harmonics.

It may be interesting to analyze the impact of one
time-varying process on another not merely in an "average"
fashion, as provided by linear regression methods, but rather a
component by component type impact in the sense of a spectral
analytic decomposition of the bivariate process $(X_t, Y_t)$.




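The covariance-stationary bivariate process $(X_t, Y_t)$ is
defined by the properties:

\[
\begin{aligned}
E(X_t) &= m_x \\
E(Y_t) &= m_y \\
\operatorname{var}(X_t) &= E(X_t - m_x)^2 = \sigma_x^2 \\
\operatorname{var}(Y_t) &= E(Y_t - m_y)^2 = \sigma_y^2 \\
\operatorname{cov}(X_t, X_{t+\tau}) &= E[(X_t - m_x)(X_{t+\tau} - m_x)] = \mu_{xx}(\tau) \\
\operatorname{cov}(Y_t, Y_{t+\tau}) &= E[(Y_t - m_y)(Y_{t+\tau} - m_y)] = \mu_{yy}(\tau) \\
\operatorname{cov}(X_t, Y_{t+\tau}) &= E[(X_t - m_x)(Y_{t+\tau} - m_y)] = \mu_{xy}(\tau) \\
\operatorname{cov}(Y_t, X_{t+\tau}) &= E[(Y_t - m_y)(X_{t+\tau} - m_x)] = \mu_{yx}(\tau) \\
\mu_{xy}(\tau) &= \mu_{yx}(-\tau) \\
\mu_{xy}(\tau) &\neq \mu_{xy}(-\tau) \quad \text{in general}
\end{aligned}
\tag{11}
\]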
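The spectral representation of the autocovariances and
cross-covariances is

\[
\begin{aligned}
\mu_{xx}(\tau) &= \int_{-\pi}^{\pi} e^{i\tau\omega} f_x(\omega)\, d\omega \\
\mu_{yy}(\tau) &= \int_{-\pi}^{\pi} e^{i\tau\omega} f_y(\omega)\, d\omega \\
\mu_{xy}(\tau) &= \int_{-\pi}^{\pi} e^{i\tau\omega}\, \mathrm{cr}(\omega)\, d\omega
\end{aligned}
\tag{12}
\]

where $\mathrm{cr}(\omega)$ is called the power cross-spectrum and

\[
\mathrm{cr}(\omega) = c(\omega) + i\, q(\omega) \tag{13}
\]

$c(\omega)$ is called the cospectrum; it is the real part of
$\mathrm{cr}(\omega)$. $q(\omega)$ is called the quadrature
spectrum; it is the imaginary part of the cross-spectrum.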

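As in the univariate process $X_t$ described above the
amplitudes of different frequency components are uncorrelated in
the bivariate process $(X_t, Y_t)$.

Let $E(X_t) = E(Y_t) = 0$; this implies

\[
\begin{aligned}
&E(du_x(\omega)) = E(du_y(\omega)) = E(dv_x(\omega)) = E(dv_y(\omega)) = 0 \\
&E(du_x(\omega_1)\, du_x(\omega_2)) = E(dv_x(\omega_1)\, dv_x(\omega_2)) =
\begin{cases} 0, & \omega_1 \neq \omega_2 \\ 2 f_x(\omega)\, d\omega, & \omega_1 = \omega_2 = \omega \end{cases} \\
&E(du_y(\omega_1)\, du_y(\omega_2)) = E(dv_y(\omega_1)\, dv_y(\omega_2)) =
\begin{cases} 0, & \omega_1 \neq \omega_2 \\ 2 f_y(\omega)\, d\omega, & \omega_1 = \omega_2 = \omega \end{cases} \\
&E(du_x(\omega_1)\, du_y(\omega_2)) = E(dv_x(\omega_1)\, dv_y(\omega_2)) =
\begin{cases} 0, & \omega_1 \neq \omega_2 \\ 2 c(\omega)\, d\omega, & \omega_1 = \omega_2 = \omega \end{cases} \\
&E(du_x(\omega_1)\, dv_y(\omega_2)) =
\begin{cases} 0, & \omega_1 \neq \omega_2 \\ 2 q(\omega)\, d\omega, & \omega_1 = \omega_2 = \omega \end{cases} \\
&E(du_y(\omega_1)\, dv_x(\omega_2)) =
\begin{cases} 0, & \omega_1 \neq \omega_2 \\ -2 q(\omega)\, d\omega, & \omega_1 = \omega_2 = \omega \end{cases}
\end{aligned}
\tag{14}
\]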


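By inversion of (12) $f_x(\omega)$, $f_y(\omega)$, and
$\mathrm{cr}(\omega)$ are derived:

\[
\begin{aligned}
c(\omega) &= \frac{1}{2\pi}\Bigl[\mu_{xy}(0) + \sum_{\tau=1}^{\infty} \bigl(\mu_{xy}(\tau) + \mu_{yx}(\tau)\bigr) \cos \tau\omega\Bigr] \\
q(\omega) &= \frac{1}{2\pi} \sum_{\tau=1}^{\infty} \bigl(\mu_{xy}(\tau) - \mu_{yx}(\tau)\bigr) \sin \tau\omega
\end{aligned}
\tag{15}
\]

The estimators of $c(\omega)$ and $q(\omega)$ contain filters
similar to those described for the estimators of $f(\omega)$.

The cross-spectrum and its components are not per se of
interest. Certain magnitudes derived from it, however, are
important.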

1. The coherence square, defined by the formula

\[
C(\omega) = \frac{c^2(\omega) + q^2(\omega)}{f_x(\omega)\, f_y(\omega)} \tag{16}
\]

which will always be between zero and one, since the inequality
$c^2(\omega) + q^2(\omega) \le f_x(\omega) f_y(\omega)$ holds, is a
measure of linear association (corresponding to the coefficient of
determination) of components of equal frequency of $X_t$ and
$Y_t$. The coherence diagram is a plot of $C(\omega)$ over
$\omega$. Its pattern may be very interesting for the analysis.

2. The phase shift between components of $X_t$ and $Y_t$ of
frequency $\omega$ is defined by

\[
\psi(\omega) = \arctan \frac{q(\omega)}{c(\omega)} \tag{17}
\]

In this form it measures the lag between the components of
equal frequency of $X_t$ and $Y_t$ in fractions of a full circle
($2\pi$). The phase diagram is a plot of $\psi(\omega)$ over
$\omega$, whose pattern is of very great interest and importance.







3. The gain or transfer function $T_{xy}(\omega)$ corresponds to
a regression coefficient of the component of frequency $\omega$ of
$X_t$ on that of $Y_t$. It is defined by

\[
T_{xy}^2(\omega) = \frac{f_x(\omega)\, C(\omega)}{f_y(\omega)} \tag{18}
\]


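Essentially the relationship between $X_t$ and $Y_t$ is assumed
to be of the form

\[
\int_{0}^{\pi} \cos t\omega\; du_x(\omega) + \int_{0}^{\pi} \sin t\omega\; dv_x(\omega)
= \int_{0}^{\pi} T_{xy}(\omega)\bigl[\cos(t\omega - \psi(\omega))\, du_y(\omega) + \sin(t\omega - \psi(\omega))\, dv_y(\omega)\bigr] \tag{19}
\]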
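
Once estimates of the cospectrum, quadrature spectrum, and the two power spectra are available at a frequency, the three derived magnitudes follow by simple arithmetic. The sketch below is illustrative: the function name and the input values are hypothetical, `atan2` is used as a quadrant-aware form of the arc tangent of q/c, and the gain is taken as the modulus of the cross-spectrum divided by the spectrum of Y, which is an assumption about the intended normalization.

```python
import math


def cross_spectral_summary(c, q, f_x, f_y):
    """Coherence, phase, and gain at a single frequency.

    c, q     : cospectrum and quadrature spectrum estimates
    f_x, f_y : power spectrum estimates of X and Y
    """
    coherence = (c * c + q * q) / (f_x * f_y)
    phase = math.atan2(q, c)  # radians; equals arctan(q/c) when c > 0
    gain = math.sqrt(coherence * f_x / f_y)  # same as |cr| / f_y
    return coherence, phase, gain


# Hypothetical estimates at one frequency.
coherence, phase, gain = cross_spectral_summary(c=0.3, q=0.4, f_x=1.0, f_y=0.5)
lag_fraction = phase / (2.0 * math.pi)  # lag as a fraction of a full circle
print(coherence, phase, gain, lag_fraction)
```

Evaluating the function over all estimated frequencies gives the points of the coherence and phase diagrams.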


The above techniques are one possible set of statistical
methods that can be employed sequentially in the decomposition 
phase of the analysis of data generated by simulations of com- 
plex hierarchically structured organizations. An analysis 
sequence of this general form was found most applicable in the 
statistical analysis of simulation data generated by the Leviathan 
simulation of a complex organization carried out at the System 
Development Corporation in Santa Monica by Beatrice Rome and 
Sydney Rome. 


Summary and Conclusion 


This paper reported on an attempt to use statistical infer- 
ence in the analysis of complex organization processes; these 
were characterized by multiple hierarchical interrelations, 
loss of information on the interaction of components, and a 
high level of "noise" within the overall system. 

In the absence of any known algorithm for the analysis of 
such processes, a set of "decomposition and reassembly" heur- 
istics was formulated. These research procedures "decompose" 
systems into overall components (this is called the "decompo- 
sition phase"); thereafter, the behavior of these components 
is inferred with the help of sequential procedures. The
behavior of the overall system is inferred from the behavior of
the components (this is called the "reassembly phase").




Two types of component and system behavior are inferred: 
(1) the average behavior of the system is estimated -- in the 
macroeconometric vein -- in the form of a set of multiple re- 
gression equations; Gauss-Seidel methods are found applicable 
both for the estimation of the individual equations and for the 
solution of the total system of equations; (2) time-dependent 
behavior patterns are inferred with the help of spectral and 
cross-spectral analysis techniques. The use of these heuristics 
is illustrated by the analysis and interpretation of the be- 
havior of a complex organization. The data for the analysis 
subjects were derived from the results of experiments, based on 
the "Leviathan" approach to the design of complex organization 
simulation experiments, developed at the System Development 
Corporation, Santa Monica, California. 


Bibliography 


1. Granger, C.W.J. Spectral Analysis of Economic Time Series.
Princeton, N.J.: Princeton University Press, 1964.

2. Pfaff, Martin. A Methodology for the Measurement of Com- 
plex Organizational Learning Processes. East Lansing, 
Mich.: Computer Institute for Social Science Research, 
Michigan State University, May, 1968. 


3. Pfaff, Martin and Anita. Organizational Processes and Pat- 
terns (in preparation for publication). 


4. Ralston, A., and Wilf, H.S. Mathematical Methods for Digital
Computers. New York: John Wiley & Sons, Inc., 1960.


5. Tukey, John W. "The Future of Data Analysis," Annals of 
Mathematical Statistics, XXXIII (1962), 1-67. 


Simulation Techniques 


(Abstract) 


Richard E. Trueman, San Fernando 
Valley State College 


There are many different ways in which the efficiency and 
utility of computer simulations can be improved. Efficiency 
can, of course, be defined by various measures, depending on 
the type of system to be simulated, the objectives of the 
simulation study, and the actual mechanization (programming) 
of the simulation model. The measures considered here are 
minimization of the variances of sample means, minimization
of the time for individual computer runs, and reduction of the
number of computer runs required. The particular
variance-reduction schemes discussed are:


1. Correlated sampling, which is extremely useful 
when making relative comparisons; 

2. Antithetic variate sampling, an important tech- 
nique in the efficient determination of absolute, rather 
than relative, values; and 

3. Selective sampling, a sampling-without-replacement 
procedure which ensures that the relative frequency of
simulated events will conform as closely as possible to their
given probability of occurrence.
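
As an illustration of the second scheme, the sketch below compares a crude Monte Carlo estimate with an antithetic-variate estimate of the same quantity. It is not drawn from the paper: the integrand (the exponential function on the unit interval), the function names, the run sizes, and the seed are all assumptions chosen to keep the example small.

```python
import math
import random


def crude_estimate(g, n, rng):
    """Average of g over n independent uniform draws."""
    return sum(g(rng.random()) for _ in range(n)) / n


def antithetic_estimate(g, n, rng):
    """Average of g over n/2 antithetic pairs (u, 1 - u)."""
    pairs = n // 2
    total = 0.0
    for _ in range(pairs):
        u = rng.random()
        total += 0.5 * (g(u) + g(1.0 - u))
    return total / pairs


def spread(estimator, g, n, replications, seed):
    """Sample variance of the estimator across independent replications."""
    rng = random.Random(seed)
    values = [estimator(g, n, rng) for _ in range(replications)]
    mean = sum(values) / replications
    return sum((v - mean) ** 2 for v in values) / (replications - 1)


# Both estimators evaluate g the same number of times per replication.
var_crude = spread(crude_estimate, math.exp, 100, 200, seed=1)
var_anti = spread(antithetic_estimate, math.exp, 100, 200, seed=1)
print(var_crude, var_anti)
```

The gain comes from the negative correlation between g(u) and g(1 - u) for a monotone g; for a non-monotone response the pairing may help little or not at all.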


Topics involving improvement in simulation programming and 
experiments include the elimination of unnecessary events in 
an event-oriented simulation model, the elimination of unnec- 
essary accuracy requirements, and the early termination of 
non-productive runs in heuristic simulation models. 

In regard to the utility of computer simulations, topics
discussed are presentation of results using histograms, dynamic
determination of the amount of information to be printed
out, and the adaptation of an existing program to an entirely 
different problem. 

No attempt is made at comprehensive coverage of these top- 
ics; the emphasis is on practical applications of certain 
techniques, primarily those the author has found particularly 
useful. Several of these applications involve queuing pro- 
blems, an area where a great deal of productive simulation 
effort has been applied. 


Multi-Dimensional 
Verification (Abstract) 


Stanley F. Stasch, Northwestern University 


This paper focuses on the question: What is the effect on 
simulation verification if both disaggregate and aggregate 
measures, instead of only aggregate measures, are used? An 
experiment, devised to study the question, used a number of 
hypotheses purporting to explain the innovation diffusion pro- 
cess and fabricated data reflecting a small, artificial popu- 
lation. Both aggregate and disaggregate measures were gener- 
ated in the simulation of each hypothesis. An attempt was 
then made to identify those disaggregate measures which were 
most capable of verifying the simulation. Although the results 
were inconclusive, the experiment raised a number of questions,
two of which were:


1. What is the role played by chance in determining 
the outcome of a simulation? 
2. What effect does the number of iterations have on
simulated results?


Answers to these questions must be forthcoming before it is 
possible to determine if the inconclusive results were caused 
by an inadequate number of iterations, by the chance aspects 
of the phenomenon being simulated, or by some other deficiency 
in the simulation. 

A second study was undertaken to determine the number of 
iterations required for the stabilization of the various
aggregate and disaggregate measures used in the first study.

If certain measures used in the verification process had not
stabilized, they would, quite obviously, be unreliable and
hence of little use in the verification process. In the
second study, the stability of both aggregate and disaggregate
measures was observed by recording those measures at frequent
intervals during the simulation. Some aggregate measures were 
found to stabilize in a relatively small number of iterations, 
as did some of the disaggregate measures. However, many dis- 
aggregate measures did not stabilize until after a large num- 
ber of iterations, and some did not stabilize at all. 
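
One simple way to operationalize "stabilization" is sketched below: the running mean of a measure is recorded at fixed intervals, and the measure is declared stable once the last few recorded values agree within a tolerance. The abstract does not state the criterion actually used, so the function name, the checkpoint interval, the window, and the tolerance are all hypothetical.

```python
def stabilization_point(values, interval, window, tolerance):
    """Iteration at which the running mean first stays within `tolerance`
    over `window` consecutive checkpoints, or None if it never does."""
    checkpoints = []
    total = 0.0
    for i, value in enumerate(values, start=1):
        total += value
        if i % interval == 0:
            checkpoints.append(total / i)
            recent = checkpoints[-window:]
            if len(recent) == window and max(recent) - min(recent) <= tolerance:
                return i
    return None


# A measure that settles at once, and one that keeps drifting.
steady = [0.0, 1.0] * 50
drifting = [float(i) for i in range(100)]
print(stabilization_point(steady, interval=10, window=3, tolerance=0.01))
print(stabilization_point(drifting, interval=10, window=3, tolerance=0.01))
```

Applying the same check to each aggregate and disaggregate measure would give one stabilization point per measure, with `None` marking those that never settle within the run.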

Possibly the most interesting conclusion is that which sug- 
gests that simulation verification must take place in a multi- 
dimensional space. This study treated levels of disaggrega- 
tion and sample size as two dimensions in such a space. But 
the verification space may not be limited to two dimensions! 
Most simulations are possible only if the researcher makes 
some statements or assumptions concerning the conditions exist- 
ing at the beginning of the simulation. That such initial 
conditions affect the simulated results is an opinion held by 
many, if not most, researchers interested in simulation. It 
may therefore be appropriate to suggest that initial conditions 
form the third dimension delineating the simulation verifica- 
tion space. Furthermore, there are probably researchers who 
would argue that still other dimensions should be added to the
verification space.


Life Insurance Models 
(Abstract) 


Russell M. Collins, Jr., Minnesota Mutual 
Life Insurance Company 


This paper describes an application of simulation models 
to life insurance problems. 

First, some simple Monte Carlo simulation models of the
mortality experience of a group of lives developed by Messrs.
John Boermeester, Sidney Benjamin, and the author are
described. Then a somewhat more complicated simulation model
of a life insurance company reinsurance pool is described in 
some detail. 

This leads naturally to a discussion of the corporate model 
of a life insurance company. The advantages of possessing 
such a model are discussed as are the uses to which such a 
model might be put. Some of the characteristics of such a 
model are cited as well as some of the problems to be
encountered in the endeavor to construct it. Recent developments
in the life insurance industry in this area as well as in the
construction of agency financial models are described briefly.

One approach to the construction of a corporate model of a
life insurance company utilizing a management game is discussed.

The author expresses the hope that the presentation of this 
paper at the Symposium will not only bring material of
interest before the participants but also that they will be
stimulated to apply their expertise in this field to some of the
problems raised in these applications.


Risk Theory Models 
(Abstract) 


John M. Boermeester, John Hancock Mutual 
Life Insurance Company 


A life insurance company, like any business enterprise, is 
subject to the risks of random and nonrandom events. Unlike 
most industrial organizations, a life insurance company is 
primarily organized to assume certain types of risks for a 
consideration. Company decisions involve quantitative values 
which concern premiums, dividends, surplus, reserves and other 
financial values. The discussion herein is directed to the 
distribution of losses which arise because of random fluctua- 
tions in claim rates. 

A brief description is given of several problems which are
related in the sense that they possess similar risk
distribution characteristics. In each instance, the answer to the
problem is very sensitive to amounts associated with variables
which lie in the right-hand tail of a frequency distribution.
The determination of these frequency distributions by
analytical means is at best very difficult due to the complex
interrelationship of the variables. Hence, a simulation procedure
is suggested as a means of solution.

For an illustration of principles, an analysis is given for 
the following hypothetical stop-loss reinsurance problem: 

A life insurance company provides one-year term insurance
for a group of lives. The sum insured and the probability
of death differ for each life. A proposal is offered by
another insurer to provide a reinsurance benefit equal to
a fraction of the total yearly claims which might arise in
excess of a stated amount, subject to a maximum total
reinsurance benefit. What is the amount of premium which should
be charged to provide for benefits which may be traced to
random mortality fluctuation alone?







This problem is specifically discussed with respect to a 
group of 223 lives with insurance benefits ranging from $2,000 
to $200,000 and probabilities of death ranging from .001 to 
.05. The solution is based upon a modification of a forced 
sampling technique described by S. Benjamin. This technique 
is based on a compound probability function which involves the 
distribution of the number of deaths and the distribution of 
individual claim amounts. The distribution of deaths is 
established in accordance with a Poisson function with param- 
eter equal to the expected number of deaths. The distribution 
of total claim amounts is established by a Monte Carlo procedure
under which members of the group are successively "killed".
A random number is generated to determine whether or not a
life is to die according to the conditional probability that
this life dies first among the survivors. A distribution of
the reinsurance benefits based upon the analysis of 100 years
of simulated experience is displayed.
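
The stop-loss problem can be illustrated with a much simpler Monte Carlo scheme than the forced sampling technique described above: each life is simply allowed to die independently with its own probability in each simulated year. This is a plainly different and less efficient procedure, used here only to show the structure of the calculation. The group composition, retention, reinsured fraction, maximum benefit, and seed are hypothetical; the 223 lives of the paper are not reproduced.

```python
import random


def reinsurance_benefit(total_claims, retention, fraction, max_benefit):
    """Benefit: a fraction of claims above the retention, capped at a maximum."""
    excess = max(0.0, total_claims - retention)
    return min(max_benefit, fraction * excess)


def simulate_years(lives, years, retention, fraction, max_benefit, seed=0):
    """lives: list of (sum_insured, probability_of_death) pairs."""
    rng = random.Random(seed)
    benefits = []
    for _ in range(years):
        # Plain Bernoulli sampling: each life dies independently.
        claims = sum(amount for amount, p in lives if rng.random() < p)
        benefits.append(
            reinsurance_benefit(claims, retention, fraction, max_benefit)
        )
    return benefits


# Hypothetical group of 223 lives with mixed sums insured and death rates.
lives = (
    [(2_000.0, 0.001)] * 100
    + [(50_000.0, 0.01)] * 100
    + [(200_000.0, 0.05)] * 23
)
benefits = simulate_years(
    lives, years=100, retention=250_000.0, fraction=0.8, max_benefit=400_000.0
)
net_premium = sum(benefits) / len(benefits)  # mean simulated benefit
print(net_premium)
```

The mean benefit estimates the net premium for random mortality fluctuation alone; because the result depends on the right-hand tail, far more than 100 simulated years would normally be needed for a stable figure under this crude scheme.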
















